# Face Recognition and AI Services

### Step RE1: Face detect and insights

***The step below imports all the libraries needed throughout this lab***

In [None]:
#Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#SPDX-License-Identifier: MIT-0 (For details, see https://github.com/awsdocs/amazon-rekognition-developer-guide/blob/master/LICENSE-SAMPLECODE.)

!pip install tabulate

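import boto3
import json
import urllib.request  # 'import urllib' alone does not guarantee urllib.request is loaded
import time
import re
import tabulate
from io import BytesIO
from botocore.exceptions import ClientError  # used by delete_collection in Step RE12
from IPython.display import HTML, Audio, display
from PIL import Image as Img
import matplotlib.pyplot as plt
import matplotlib.patches as patches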

rekognition=boto3.client('rekognition')

***Let's take Salah's image as an example. He is a football player, playing for Liverpool FC***    
***Sorry if you support another team, but do not take it personally :)***

In [None]:
# image source = https://www.liverpoolfc.com/news/first-team/277625-mohamed-salah-i-can-t-explain-how-it-feels-it-s-a-dream-come-true
salah_a_url = 'https://d3j2s6hdd6a7rg.cloudfront.net/v2/uploads/media/default/0001/50/thumb_49256_default_news_size_5.jpeg'

In [None]:
HTML(data='<figure style="float:left;"><img src="{}" alt="Source" width="200"/><figcaption ><center>Salah</center></figcaption></figure>'.format(salah_a_url))

In [None]:
salah_a = urllib.request.urlopen(salah_a_url)
detect_emotion_response = rekognition.detect_faces(
    Image={
        'Bytes': salah_a.read()
    },Attributes=['ALL']
)
print(json.dumps(detect_emotion_response, indent=4))

***Extract some interesting insights from that output***

In [None]:
print("Age Range: {}".format(detect_emotion_response['FaceDetails'][0]['AgeRange']))
print("Gender: {}".format(detect_emotion_response['FaceDetails'][0]['Gender']))
print("Smiling?: {}".format(detect_emotion_response['FaceDetails'][0]['Smile']))
print("Has beard?: {}".format(detect_emotion_response['FaceDetails'][0]['Beard']))
print("Emotions: {}".format(json.dumps(detect_emotion_response['FaceDetails'][0]['Emotions'], indent=4)))
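
The `Emotions` list holds one entry per emotion type, each with its own confidence. The helper below is a minimal sketch of picking the strongest one. The name `dominant_emotion` and the sample data are hypothetical, written by hand in the shape of one `FaceDetails` entry, so no AWS call is needed to try it.

```python
def dominant_emotion(face_detail):
    """Return (type, confidence) of the highest-confidence emotion."""
    top = max(face_detail['Emotions'], key=lambda e: e['Confidence'])
    return top['Type'], top['Confidence']

# Hypothetical sample in the shape of one 'FaceDetails' entry
sample_face = {
    'Emotions': [
        {'Type': 'CALM', 'Confidence': 12.5},
        {'Type': 'HAPPY', 'Confidence': 85.0},
        {'Type': 'SURPRISED', 'Confidence': 2.5},
    ]
}
print(dominant_emotion(sample_face))
```

With a real response, it could be called as `dominant_emotion(detect_emotion_response['FaceDetails'][0])`.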

### Step RE2: (Optional Challenge) How to be sad, angry, confused, and surprised?

***Now, you can try uploading your own image***    
***The challenge is to pose your face so that Rekognition reports that you are SAD with a confidence of at least 90%***    
***Repeat this with other emotions: confused, calm, surprised, etc.***   

To upload the image, open the same Jupyter Notebook page in another tab, navigate to the directory machine-learning-workshop/face-recognition-and-ai-services/images, and click the Upload button.     

***Change the photo path to the path of your uploaded image.***

In [None]:
my_photo_path = 'images/yudho.jpg' # CHANGE HERE as appropriate

HTML(data='<figure style="float:left;"><img src="{}" alt="Me" width="200"/><figcaption ><center>Me</center></figcaption></figure>'.format(my_photo_path))

In [None]:
# Use a context manager so the file is closed after it is read
with open(my_photo_path, 'rb') as my_photo:
    detect_emotion_response = rekognition.detect_faces(
        Image={
            'Bytes': my_photo.read()
        },Attributes=['ALL']
    )
print(json.dumps(detect_emotion_response['FaceDetails'][0]['Emotions'], indent=4))
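
To check the challenge automatically rather than reading the JSON by eye, a small helper along these lines could be used. This is a sketch only: `meets_challenge` and the sample list are hypothetical, and the 90% default simply mirrors the target stated above.

```python
def meets_challenge(emotions, target, min_confidence=90.0):
    """Return True if the target emotion is present with enough confidence."""
    for emotion in emotions:
        if emotion['Type'] == target.upper():
            return emotion['Confidence'] >= min_confidence
    return False

# Hypothetical sample in the shape of the 'Emotions' list
sample_emotions = [
    {'Type': 'SAD', 'Confidence': 93.2},
    {'Type': 'CALM', 'Confidence': 4.1},
]
print(meets_challenge(sample_emotions, 'sad'))    # SAD is above the threshold
print(meets_challenge(sample_emotions, 'calm'))   # CALM is below the threshold
```

With a real response, pass `detect_emotion_response['FaceDetails'][0]['Emotions']` as the first argument.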

### Step RE3: Try face compare

***Define the compare_faces method first***

In [None]:
def compare_faces(imageSource, imageTarget):
    # Read both images first so the handles are closed on every code path
    # (the original close() calls sat after the return statements and never ran)
    source_bytes = imageSource.read()
    target_bytes = imageTarget.read()
    imageSource.close()
    imageTarget.close()

    response=rekognition.compare_faces(SimilarityThreshold=70,
                                  SourceImage={'Bytes': source_bytes},
                                  TargetImage={'Bytes': target_bytes})

    if not response['FaceMatches']:
        print('No face match found')
        return False

    # Report and return only the first match
    faceMatch = response['FaceMatches'][0]
    position = faceMatch['Face']['BoundingBox']
    similarity = faceMatch['Similarity']
    print('Matched face found with ' + str(round(similarity,2)) + '% similarity\n' +
          'Location in target image: {left:' +
           str(round(position['Left'],2)) + ',top:' +
           str(round(position['Top'],2)) + ',height:' +
           str(round(position['Height'],2)) + ',width:' +
           str(round(position['Width'],2)) + '}')
    # Bounding box values are ratios of the overall image width and height
    return {
        'confidence': similarity,
        'left': position['Left'],
        'top': position['Top'],
        'height': position['Height'],
        'width': position['Width']
    }


***Let's fetch another image of Salah for the face comparison***

In [None]:
# image source = https://www.liverpoolfc.com/news/first-team/339044-mohamed-salah-manchester-united
salah_b_url = 'https://d3j2s6hdd6a7rg.cloudfront.net/v2/uploads/media/default/0001/82/thumb_81992_default_news_size_5.jpeg'

HTML(data='<figure style="float:left;"><img src="{}" alt="Source" width="300"/><figcaption ><center>Source</center></figcaption></figure><figure style="float:right;"><img src="{}" alt="Target" width="220"/><figcaption><center>Target</center></figcaption></figure>'.format(salah_a_url, salah_b_url))

***Now let's compare the two images of Salah***    
Notice that the face in the target image is at a different angle and has a different expression

In [None]:
salah_a = urllib.request.urlopen(salah_a_url)
salah_b = urllib.request.urlopen(salah_b_url)
compare_1_result = compare_faces(salah_a, salah_b)

***Salah plays for Liverpool FC. Will Rekognition be able to identify him in a picture of Liverpool players?***

In [None]:
# image source = https://www.dailystar.co.uk/sport/football/631810/Liverpool-celebrate-Premier-League-Asia-Trophy-Leicester-Hong-Kong-sportgalleries
liverpool_url = 'https://cdn.images.dailystar.co.uk/dynamic/122/photos/880000/900x738/1017880.jpg'

HTML(data='<figure style="float:left;"><img src="{}" alt="Source" width="300"/><figcaption ><center>Source</center></figcaption></figure><figure style="float:right;"><img src="{}" alt="Target" width="220"/><figcaption><center>Target</center></figcaption></figure>'.format(salah_a_url, liverpool_url))

In [None]:
salah_a = urllib.request.urlopen(salah_a_url)
liverpool = urllib.request.urlopen(liverpool_url)
compare_2_result = compare_faces(salah_a, liverpool)

***Now, let's draw a rectangle on the matched face, based on the bounding box information returned***

In [None]:
# Get image sizes
liverpool = urllib.request.urlopen(liverpool_url)
file = BytesIO(liverpool.read())
width, height = Img.open(file).size

# Load image to plot
liverpool = urllib.request.urlopen(liverpool_url)
liverpool_image = plt.imread(liverpool, format='jpg')

# Get figure and axes
fig,ax = plt.subplots()
fig.set_size_inches(width/fig.dpi*0.8,height/fig.dpi*0.8)

# Get bounding box details
bounding_box = compare_2_result

ax.imshow(liverpool_image)

# Create a Rectangle patch
x = bounding_box['left']*width
y = bounding_box['top']*height
w = bounding_box['width']*width
h = bounding_box['height']*height
rect = patches.Rectangle((x,y),w,h,linewidth=3,edgecolor='y',facecolor='none')

# Add the bounding box
ax.add_patch(rect)

plt.show()
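
The bounding box values returned by Rekognition are ratios of the image dimensions, which is why the cell above multiplies them by the width and height. That conversion can be isolated in a small helper, sketched below. The name `to_pixel_box` is hypothetical; the lowercase keys match the dictionary returned by `compare_faces` in this notebook.

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a ratio-based bounding box to pixel (x, y, w, h)."""
    x = box['left'] * image_width
    y = box['top'] * image_height
    w = box['width'] * image_width
    h = box['height'] * image_height
    return x, y, w, h

# Hypothetical box covering part of an 800x400 image
sample_box = {'left': 0.25, 'top': 0.5, 'width': 0.125, 'height': 0.25}
print(to_pixel_box(sample_box, 800, 400))
```

The resulting tuple can be passed to `patches.Rectangle((x, y), w, h, ...)` as in the cell above.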

***What if we present a lineup picture of Bayern Munich players? It should not find Salah in there...***

In [None]:
# image source = https://fcbayern.com/en/club/honours/all-honours
bayern_url = 'https://fcbayern.com/binaries/content/gallery/fc-bayern/homepage/club/erfolge/meisterschaft/2016_header.jpg'

HTML(data='<figure style="float:left;"><img src="{}" alt="Source" width="200"/><figcaption ><center>Source</center></figcaption></figure><figure style="float:right;"><img src="{}" alt="Target" width="200"/><figcaption><center>Target</center></figcaption></figure>'.format(salah_a_url, bayern_url))

In [None]:
salah_a = urllib.request.urlopen(salah_a_url)
bayern = urllib.request.urlopen(bayern_url)
compare_3_result = compare_faces(salah_a, bayern)

### Step RE4: Photo-ID

***Amazon Rekognition can also recognize a face in a photo ID, which could be part of a real-world use case***

In [None]:
photo_id_path = 'images/yudho-card.jpg'
person_path = 'images/yudho.jpg'

HTML(data='<figure style="float:left;"><img src="{}" alt="Source" width="200"/><figcaption ><center>Source</center></figcaption></figure><figure style="float:right;"><img src="{}" alt="Target" width="120"/><figcaption ><center>Target</center></figcaption></figure>'.format(photo_id_path, person_path))




In [None]:
person = open(person_path,'rb')
photo_id = open(photo_id_path, 'rb')
compare_4_result = compare_faces(person, photo_id)

### Step RE5: Detect Text

***Can we extract the text from the photo ID? Let's give it a try***

In [None]:
photo_id = open(photo_id_path, 'rb')

def detect_text(image):
    # Returns the detected lines joined by newlines, or an empty string if none

    response = rekognition.detect_text(Image=
        {'Bytes': image.read()}
    )
    if len(response['TextDetections']) == 0:
        print('No Text Found')
        return ''
    else:
        texts = []
        for text_item in response['TextDetections']:
            # Keep whole lines only; WORD items repeat the same text word by word
            if text_item['Type'] == 'LINE':
                texts.append(text_item['DetectedText'])
        texts = '\n'.join(str(x) for x in texts)
        print(texts)
        return texts

# Call Amazon Rekognition detect_text API
text_detect_result = detect_text(photo_id)
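
Each text detection also carries a `Confidence` value, so noisy lines can be dropped before the text is passed on to later steps. The sketch below shows one way to do that. The function name, the 80% default, and the sample data are assumptions chosen for illustration.

```python
def lines_from_detections(detections, min_confidence=80.0):
    """Keep LINE detections whose confidence meets the threshold."""
    return [d['DetectedText'] for d in detections
            if d['Type'] == 'LINE' and d['Confidence'] >= min_confidence]

# Hypothetical sample in the shape of response['TextDetections']
sample_detections = [
    {'DetectedText': 'IDENTITY CARD', 'Type': 'LINE', 'Confidence': 99.1},
    {'DetectedText': 'IDENTITY', 'Type': 'WORD', 'Confidence': 99.3},
    {'DetectedText': 'CARD', 'Type': 'WORD', 'Confidence': 98.9},
    {'DetectedText': 'l1l|', 'Type': 'LINE', 'Confidence': 41.0},
]
print(lines_from_detections(sample_detections))
```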


***You can also use Amazon Textract, which is built specifically for extracting text from images.***    
At the time of writing, the service is in preview. Sign up to request access

### Step RE6: Discover data from text

***Wouldn't it be interesting to see an AI trying to understand the text and come up with meaningful information?***    
***This AI service is called Amazon Comprehend***

In [None]:
comprehend = boto3.client('comprehend')

response = comprehend.detect_entities(
    Text=text_detect_result,
    LanguageCode='en'
)

print(json.dumps(response['Entities'], indent=4, sort_keys=True))

***Cool, but only a few entities may have been found***    
***Let's try to discover data from a more comprehensive text***

In [None]:
# Customer review source = https://www.amazon.com/Mongoose-Dolomite-Mountain-26-Inch-Wheels/product-reviews/B01N2Z117A/ref=cm_cr_arp_d_paging_btm_next_2?ie=UTF8&reviewerType=all_reviews&pageNumber=2
with open('texts/customer_review_example.txt','r') as review_file:
    review = review_file.read()
print(review)

In [None]:
# Call comprehend detect_entities API
response = comprehend.detect_entities(
    Text=review,
    LanguageCode='en'
)

# Display in table format
text_table = ['TEXT']
score_table = ['CONFIDENCE SCORE']
entity_type_table = ['ENTITY_TYPE']
for entity in response['Entities']:
    text_table.append(entity['Text'])
    score_table.append(entity['Score'])
    entity_type_table.append(entity['Type'])
comprehend_table = [text_table, entity_type_table, score_table]
display(HTML(tabulate.tabulate(comprehend_table, tablefmt='html')))
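
The cell above builds one list per field, so each entity ends up as a column of the rendered table. If one row per entity is easier to read, the data can be arranged row-wise instead. The sketch below assumes only the `Text`, `Type`, and `Score` keys used above; `entity_rows` and the sample entities are hypothetical.

```python
def entity_rows(entities):
    """Build a header row followed by one row per entity."""
    rows = [['TEXT', 'ENTITY_TYPE', 'CONFIDENCE SCORE']]
    for entity in entities:
        rows.append([entity['Text'], entity['Type'], round(entity['Score'], 3)])
    return rows

# Hypothetical sample in the shape of response['Entities']
sample_entities = [
    {'Text': 'Mongoose', 'Type': 'ORGANIZATION', 'Score': 0.98},
    {'Text': '26-Inch', 'Type': 'QUANTITY', 'Score': 0.75},
]
for row in entity_rows(sample_entities):
    print(row)
```

In the notebook, the rows can be rendered with `tabulate.tabulate(rows, headers='firstrow', tablefmt='html')`.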

***We can infer the sentiment of the text too using Comprehend***

In [None]:
response = comprehend.detect_sentiment(
    Text=review,
    LanguageCode='en'
)
print(json.dumps(response, indent=4))

***Note that the overall sentiment is POSITIVE, but the other sentiments also receive low, non-zero scores. This is consistent with the product rating given by this customer, which is 4 out of 5***     

***Try to read the review above and you may understand why the sentiment is not 100% positive***
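
To see at a glance how the non-positive sentiments compare, the `SentimentScore` map can be sorted by score. This is a minimal sketch; `ranked_sentiments` and the numbers in the sample are made up for illustration and are not the scores of the review above.

```python
def ranked_sentiments(response):
    """Return (label, score) pairs sorted from highest to lowest score."""
    scores = response['SentimentScore']
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical sample in the shape of a detect_sentiment response
sample_sentiment = {
    'Sentiment': 'POSITIVE',
    'SentimentScore': {
        'Positive': 0.81, 'Negative': 0.04, 'Neutral': 0.05, 'Mixed': 0.10,
    },
}
for label, score in ranked_sentiments(sample_sentiment):
    print('{:<10}{:.2f}'.format(label, score))
```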

### Step RE7: Read it out loud

***Let's go back to the photo ID and read out the result of text detection with Amazon Polly***

In [None]:
polly = boto3.client('polly')

def synthesize_speech(text,text_type):
    response = polly.synthesize_speech(
        OutputFormat='mp3',
        SampleRate='16000',
        Text=text,
        TextType=text_type,
        VoiceId='Matthew',
        LanguageCode='en-US'
    )
    return response
    
# Call Polly API
voice_synthesis_result = synthesize_speech(text_detect_result,'text')

# Save result to file
file_name = 'photo_id_info.mp3'
file = open('voices/{}'.format(file_name), 'wb')
file.write(voice_synthesis_result['AudioStream'].read())
file.close()


Audio(filename='voices/{}'.format(file_name))
    

***Cool, but it sounds a bit odd. Let's try to improve it with SSML techniques***    
These are the techniques that we will use:    
1. Add a 2-second break after every line
2. Add a 1-second break after each word of the person's name

In [None]:
# Change text format to SSML and add a <break> after every line
ssml_text = text_detect_result.replace('\n','<break time="2s"/>')
ssml_text = '<speak>{}</speak>'.format(ssml_text)

# Get the name (assumed to be on the third detected line) and add a 1s break after each word
names = text_detect_result.split('\n')[2].split()
for name in names:
    # re.escape keeps punctuation in the name from being read as regex syntax
    ssml_text = re.sub(r'(' + re.escape(name) + ')', r'\1<break time="1s"/>', ssml_text)

print(ssml_text)

# Call Polly API
voice_synthesis_result = synthesize_speech(ssml_text,'ssml')


# Save result to file
file_name = 'photo_id_info.mp3'
file = open('voices/{}'.format(file_name), 'wb')
file.write(voice_synthesis_result['AudioStream'].read())
file.close()


Audio(filename='voices/{}'.format(file_name))
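
SSML is XML, so characters such as `&` or `<` in the detected text would produce invalid markup. The sketch below builds the SSML from a list of lines and escapes those characters first. `build_ssml` is a hypothetical helper and the 1-second default pause is an arbitrary choice; it is an alternative way to assemble the string, not a description of how the cell above works.

```python
from xml.sax.saxutils import escape

def build_ssml(lines, pause='1s'):
    """Join lines with <break> tags and wrap the result in <speak>."""
    separator = '<break time="{}"/>'.format(pause)
    body = separator.join(escape(line) for line in lines)
    return '<speak>{}</speak>'.format(body)

# Hypothetical detected lines, one of which contains an ampersand
print(build_ssml(['Research & Development', 'Employee ID'], pause='2s'))
```

With the notebook's data, it could be called as `build_ssml(text_detect_result.split('\n'), pause='2s')`.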

### Step RE8: Transcribe it back to text

***This is a redundant step, I know, but the purpose is to try the transcription from voice to text using Amazon Transcribe***

In [None]:
transcribe = boto3.client('transcribe')

job_name = "photo_id_info"
audio_location = 'voices/{}'.format(file_name)

def start_transcription_job(job_name):
    sts = boto3.client('sts')
    s3 = boto3.client('s3')
    
    account_id = sts.get_caller_identity().get('Account')
    bucket_name = '{}-audio-to-transcribe'.format(account_id)
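    # Note: create_bucket without a CreateBucketConfiguration only works in us-east-1
    s3.create_bucket(Bucket=bucket_name)
    # Use a context manager so the audio file is closed after the upload
    with open('voices/{}'.format(file_name),'rb') as audio_file:
        s3.put_object(
            Bucket=bucket_name, 
            Key=file_name, 
            Body=audio_file, 
            ACL = 'public-read'  # makes the uploaded audio publicly readable
        )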
    job_uri = '{}/{}/{}'.format(s3.meta.endpoint_url, bucket_name, file_name)
    
    response = transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={'MediaFileUri': job_uri},
        MediaFormat='mp3',
        LanguageCode='en-US'
    )
    return response

# Initiate the transcription job
start_transcription_job(job_name)

# Wait until the transcription is done
while True:
    status = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    print("Not ready yet...")
    time.sleep(5)

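# The loop also exits on FAILED, so report the actual final status
print('Job finished with status: ' + status['TranscriptionJob']['TranscriptionJobStatus'])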
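
The polling loop above runs until the job reaches a terminal state, however long that takes. A variant with a timeout is sketched below. `wait_for_job` is a hypothetical helper that takes the status lookup as a callable, so it can be tried here with a canned sequence of statuses instead of a real Amazon Transcribe job.

```python
import time

def wait_for_job(get_status, poll_seconds=5, timeout_seconds=600, sleep=time.sleep):
    """Poll get_status() until it returns a terminal state or the timeout passes."""
    waited = 0
    while True:
        status = get_status()
        if status in ('COMPLETED', 'FAILED'):
            return status
        if waited >= timeout_seconds:
            raise TimeoutError('Job still {} after {}s'.format(status, waited))
        sleep(poll_seconds)
        waited += poll_seconds

# Stand-in for the real service call: a canned sequence of statuses
fake_statuses = iter(['IN_PROGRESS', 'IN_PROGRESS', 'COMPLETED'])
result = wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(result)
```

In the notebook, `get_status` could be `lambda: transcribe.get_transcription_job(TranscriptionJobName=job_name)['TranscriptionJob']['TranscriptionJobStatus']`.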


***Display the result***

In [None]:
# Get the URI of transcribed result
transcribe_result_uri = status['TranscriptionJob']['Transcript']['TranscriptFileUri']

# Store the result locally and open it for read
transcribe_result_storage_path = 'texts/photo_id_info.txt'
urllib.request.urlretrieve(transcribe_result_uri, transcribe_result_storage_path)
file = open(transcribe_result_storage_path, 'r')
transcript_result = file.read()
file.close()

# Display the transcription result
transcript_result_json = json.loads(transcript_result)['results']['transcripts'][0]['transcript']
print(transcript_result_json)

# Display the transcription result with confidence details
transcript_contents, transcript_confidence = ['WORD'], ['CONFIDENCE SCORE']
for item in json.loads(transcript_result)['results']['items']:
    transcript_contents.append(item['alternatives'][0]['content'])
    transcript_confidence.append(item['alternatives'][0]['confidence'])
transcript_table = [transcript_contents, transcript_confidence]
display(HTML(tabulate.tabulate(transcript_table, tablefmt='html')))

# Display the actual text for comparison purpose
print("\nActual:\n")
print(text_detect_result.replace('\n',' '))

# Delete the transcription job
delete_response = transcribe.delete_transcription_job(
   TranscriptionJobName=job_name
)

***Note that the mis-transcribed words (if any) mostly have low confidence scores***
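
To list the suspicious words directly, the transcript items can be filtered by confidence. The sketch below assumes the item layout used in the cell above, where each item has a `type` and confidence values are stored as strings. `low_confidence_words`, the 0.9 threshold, and the sample items are hypothetical.

```python
def low_confidence_words(items, threshold=0.9):
    """Return (word, confidence) for spoken words below the threshold."""
    flagged = []
    for item in items:
        # Skip punctuation items, which are not spoken words
        if item.get('type') != 'pronunciation':
            continue
        best = item['alternatives'][0]
        confidence = float(best['confidence'])
        if confidence < threshold:
            flagged.append((best['content'], confidence))
    return flagged

# Hypothetical sample in the shape of results['items']
sample_items = [
    {'type': 'pronunciation', 'alternatives': [{'content': 'identity', 'confidence': '0.99'}]},
    {'type': 'pronunciation', 'alternatives': [{'content': 'cart', 'confidence': '0.42'}]},
    {'type': 'punctuation', 'alternatives': [{'content': '.', 'confidence': '0.0'}]},
]
print(low_confidence_words(sample_items))
```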

### Step RE9: (Optional Challenge) Improve accuracy

***Find a way to get zero mis-transcribed words!***    
Examples of things that we can try:    
1. Modify the input SSML text by inserting tags, etc.    
2. Use lexicons: https://docs.aws.amazon.com/polly/latest/dg/managing-lexicons.html    
3. Increase the sample rate used for Polly speech synthesis  
4. Change the file format from mp3 to another format
5. Other creative ways...    
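
To judge whether an attempt helped, it is useful to count the mismatched words rather than compare the two texts by eye. The sketch below uses `difflib` from the standard library to align the expected and transcribed words, ignoring case and punctuation. `word_mismatches` and the sample strings are hypothetical.

```python
import difflib
import re

def word_mismatches(expected, actual):
    """Return (expected_words, actual_words) pairs where the texts differ."""
    exp = re.findall(r"[a-z0-9']+", expected.lower())
    act = re.findall(r"[a-z0-9']+", actual.lower())
    matcher = difflib.SequenceMatcher(a=exp, b=act, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            diffs.append((' '.join(exp[i1:i2]), ' '.join(act[j1:j2])))
    return diffs

# Hypothetical example: one word is transcribed incorrectly
print(word_mismatches('Name Jane Doe', 'name jane dough.'))
```

In the notebook, the inputs would be `text_detect_result` and `transcript_result_json`; an empty list means the challenge is solved.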

### Step RE10: Translate

***Now that we have the text transcribed back from audio, let's try to translate it***    
***We pick Bahasa Indonesia as the target language for this translation***

In [None]:
translate = boto3.client('translate')

translate_response = translate.translate_text(
    Text=transcript_result_json,
    SourceLanguageCode='en',
    TargetLanguageCode='id'
)
print(translate_response.get('TranslatedText'))


### Step RE11: Confirm dominant language

***With Amazon Comprehend, we can confirm that the dominant language is indeed Bahasa Indonesia***

In [None]:
detect_dominant_language_response = comprehend.detect_dominant_language(
    Text=translate_response.get('TranslatedText')
)
print(detect_dominant_language_response.get('Languages')[0])
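
The response can contain more than one candidate language, each with a score. Rather than relying on the order of the list, the best candidate can be picked explicitly, as sketched below. `top_language`, the 0.5 minimum score, and the sample response are assumptions for illustration.

```python
def top_language(response, min_score=0.5):
    """Return the best-scoring language code, or None if none is confident."""
    languages = response.get('Languages', [])
    if not languages:
        return None
    best = max(languages, key=lambda lang: lang['Score'])
    return best['LanguageCode'] if best['Score'] >= min_score else None

# Hypothetical sample in the shape of a detect_dominant_language response
sample_languages = {
    'Languages': [
        {'LanguageCode': 'ms', 'Score': 0.07},
        {'LanguageCode': 'id', 'Score': 0.92},
    ]
}
print(top_language(sample_languages))
```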

### Step RE12: Search by face

***Enough distractions. Let's go back to Amazon Rekognition for face searching***    

***Define the functions***

In [None]:
#Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#SPDX-License-Identifier: MIT-0 (For details, see https://github.com/awsdocs/amazon-rekognition-developer-guide/blob/master/LICENSE-SAMPLECODE.)

def create_collection(collectionName):
    collectionId=collectionName

    #Create a collection
    print('Creating collection:' + collectionId)
    response=rekognition.create_collection(CollectionId=collectionId)
    print('Collection ARN: ' + response['CollectionArn'])
    print('Status code: ' + str(response['StatusCode']))
    print('Done...')

def index_face(faceName, sourceFile, collectionId):

    response=rekognition.index_faces(CollectionId=collectionId,
                                Image={'Bytes': sourceFile.read()},
                                ExternalImageId=faceName,
                                MaxFaces=2,
                                QualityFilter="AUTO",
                                DetectionAttributes=['ALL'])

    print ('Results for ' + faceName)
    print('Faces indexed:')
    for faceRecord in response['FaceRecords']:
         print('  Face ID: ' + faceRecord['Face']['FaceId'])
         print('  Location: {}'.format(faceRecord['Face']['BoundingBox']))

    print('Faces not indexed:')
    for unindexedFace in response['UnindexedFaces']:
        print(' Location: {}'.format(unindexedFace['FaceDetail']['BoundingBox']))
        print(' Reasons:')
        for reason in unindexedFace['Reasons']:
            print('   ' + reason)
    return response

def search_faces_by_image(sourceFile, collectionId):
    
    threshold = 70
    maxFaces=1
  
    response=rekognition.search_faces_by_image(CollectionId=collectionId,
                                Image={'Bytes': sourceFile.read()},
                                FaceMatchThreshold=threshold,
                                MaxFaces=maxFaces)

                                
    faceMatches=response['FaceMatches']
    print ('Matching faces')
    
    if not faceMatches:
        print ('No match found')
    else:
        for match in faceMatches:
            print ('Match found with name ' + match['Face']['ExternalImageId'])
            print ('Similarity: ' + "{:.2f}".format(match['Similarity']) + "%")
            print()  # blank line between matches (a bare 'print' does nothing in Python 3)
    return response

def delete_collection(collectionId):

    print('Attempting to delete collection ' + collectionId)
    statusCode=''
    try:
        response=rekognition.delete_collection(CollectionId=collectionId)
        statusCode=response['StatusCode']
        
    except ClientError as e:
        if e.response['Error']['Code'] == 'ResourceNotFoundException':
            print ('The collection ' + collectionId + ' was not found ')
        else:
            print ('Error other than Not Found occurred: ' + e.response['Error']['Message'])
        statusCode=e.response['ResponseMetadata']['HTTPStatusCode']
    print('Operation returned Status Code: ' + str(statusCode))
    print('Done...')

       

***Let's create our first face collection***

In [None]:
collectionID = 'skillful_people'
create_collection(collectionID)

***These are faces of skillful people that will be indexed***

**IMPORTANT :** ***Upload 2 PHOTOS of your face (preferably showing different emotions) to the 'images' folder, in jpg format***    
How? Open the same Jupyter Notebook page in another tab, navigate to the directory machine-learning-workshop/face-recognition-and-ai-services/images, and click the Upload button.              

We will test whether the search-by-face function identifies us correctly

   **IMPORTANT :** ***Change the name and file name below as appropriate***

In [None]:
# This is the file path to 1 of our photos uploaded to 'images' folder
our_face_path = 'images/yudho.jpg' # CHANGE HERE as appropriate

# image source = https://football-tribe.com/indonesia/2018/03/13/evan-dimas-darmono-arek-suroboyo/
evan_dimas_url = 'https://football-tribe.com/indonesia/wp-content/uploads/sites/10/2018/03/Evan-dimas-800x449.jpg'

# image source = https://www.90min.com/posts/6260670-mesut-ozil-set-to-reject-loan-move-away-from-arsenal-to-fight-for-regular-starting-spot
mesut_ozil_url = 'https://images2.minutemediacdn.com/image/upload/c_fill,w_912,h_516,f_auto,q_auto,g_auto/shape/cover/sport/arsenal-v-qarabag-fk-uefa-europa-league-group-e-5c2b3eb8c7a324dcac000001.jpg'

# image source = https://www.imdb.com/name/nm1343894/
nicolas_anelka_url = 'https://m.media-amazon.com/images/M/MV5BNGUxMDcwOWYtNDBjMy00N2E1LWE4ZTItYmE1YmM1MDc0MWI3XkEyXkFqcGdeQXVyMjUyNDk2ODc@._V1_UX214_CR0,0,214,317_AL_.jpg'

# For Salah, the urls are already saved from previous steps

HTML(data='<figure style="float:left;"><img src="{}" alt="Me" width="120"/><figcaption ><center>Me</center></figcaption></figure><figure style="float:left;"><img src="{}" alt="Evan Dimas" width="250"/><figcaption><center>Evan Dimas</center></figcaption></figure><figure style="float:left;"><img src="{}" alt="Mesut Ozil" width="200"/><figcaption ><center>Mesut Ozil</center></figcaption></figure><figure style="float:left;"><img src="{}" alt="Nicolas Anelka" width="120"/><figcaption ><center>Nicolas Anelka</center></figcaption></figure><figure style="float:left;"><img src="{}" alt="Mohamed Salah" width="150"/><figcaption ><center>Mohamed Salah</center></figcaption></figure><figure style="float:left;"><img src="{}" alt="Mohamed Salah" width="150"/><figcaption ><center>Mohamed Salah</center></figcaption></figure>'.format(our_face_path, evan_dimas_url, mesut_ozil_url, nicolas_anelka_url, salah_a_url, salah_b_url))


***Next, we need to index our face, together with other skillful people's faces to the collection***

In [None]:
# Index Evan Dimas' face to our collection
evan_dimas = urllib.request.urlopen(evan_dimas_url)
response = index_face('evan_dimas',evan_dimas,collectionID)

# Index our face to our collection
our_face = open(our_face_path,'rb')
response = index_face('yudho',our_face,collectionID) # Change the first argument (the index name) as appropriate

# Index Mesut Ozil's face to our collection
mesut_ozil = urllib.request.urlopen(mesut_ozil_url)
response = index_face('mesut_ozil',mesut_ozil,collectionID)

# Index Nicolas Anelka's face to our collection
nicolas_anelka = urllib.request.urlopen(nicolas_anelka_url)
response = index_face('nicolas_anelka',nicolas_anelka,collectionID)

# Index Mohamed Salah's face to our collection
salah_a = urllib.request.urlopen(salah_a_url)
response = index_face('mohamed_salah',salah_a,collectionID)

# Index a second photo of Mohamed Salah to our collection
salah_b = urllib.request.urlopen(salah_b_url)
response = index_face('mohamed_salah',salah_b,collectionID)


**IMPORTANT:** ***Change the file path below to the OTHER photo of yourself that you uploaded***

***Let's do a search by image: Who is this?***

In [None]:
# This is the file path to ANOTHER photo of yours in 'images' folder
sourceFile=open('images/yudho.jpg','rb') # CHANGE HERE as appropriate

response = search_faces_by_image(sourceFile, collectionID)

***Test with another person***

In [None]:
# image source = https://bola.republika.co.id/berita/sepakbola/liga-indonesia/17/11/14/ozf0ba438-jika-evan-dimas-ke-selangor-ia-akan-ikuti-jejak-4-pemain-ini
evan_dimas_b_url = 'https://s.republika.co.id/uploads/images/inpicture_slide/evan-dimas-timnas-u19-_140206135120-764.jpg'

HTML(data='<figure style="float:left;"><img src="{}" alt="Evan Dimas" width="400"/><figcaption ><center>Evan Dimas</center></figcaption></figure>'.format(evan_dimas_b_url))

***Alright, so Rekognition, who is this person?***

In [None]:
sourceFile = urllib.request.urlopen(evan_dimas_b_url)
response = search_faces_by_image(sourceFile, collectionID)
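
Because two photos were indexed under the name `mohamed_salah`, a search that allows more than one result can return the same person several times. The sketch below collapses the matches to the best similarity per name. `best_match_per_name` and the sample matches are hypothetical; note that `search_faces_by_image` in this notebook sets `maxFaces=1`, so it would need a higher value before this becomes relevant.

```python
def best_match_per_name(face_matches):
    """Return (name, best_similarity) pairs, highest similarity first."""
    best = {}
    for match in face_matches:
        name = match['Face'].get('ExternalImageId', 'unknown')
        best[name] = max(best.get(name, 0.0), match['Similarity'])
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical sample in the shape of response['FaceMatches']
sample_matches = [
    {'Similarity': 91.5, 'Face': {'ExternalImageId': 'mohamed_salah'}},
    {'Similarity': 99.0, 'Face': {'ExternalImageId': 'mohamed_salah'}},
    {'Similarity': 72.25, 'Face': {'ExternalImageId': 'evan_dimas'}},
]
print(best_match_per_name(sample_matches))
```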

***Clean up time***

In [None]:
delete_collection(collectionID)