# Language/Voice services on AWS: AI Polly, Translate, Comprehend, Transcribe

In this demo we will take you through several AWS AI services that deal with language.

We will run through the following cases.

![title](AWS-language-service-lab-cases.png)

We will start off by installing and importing the modules we will need.

In [129]:
!pip install colorhash

You are using pip version 10.0.1, however version 19.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [130]:
import boto3
#import matplotlib.pyplot as plt
#%matplotlib inline
import json
import time, datetime
import IPython.display as ipd
from IPython.display import display, HTML, JSON
from colorhash import ColorHash

### Here is a sample of text which we will use for our demo

You can replace the text in the quotes with your own if you like.

In [131]:

textSample = """
    
    Nordic and Baltic stock markets were halted by technical problems for a second time on Friday, only minutes after trading resumed following earlier problems.

    The equity and equity derivatives markets had reopened at 12:00 GMT after a two-hour trading halt that Nasdaq attributed to connectivity issues.

    “Due to technical disturbances, Nasdaq Nordic Equity and Nasdaq Nordic Index and Equity Derivatives markets (have been) halted again,” operator Nasdaq said in an emailed statement.

    The company operates bourses in Finland, Denmark, Sweden, Iceland, Estonia, Latvia and Lithuania.

    Nasdaq said after the second stoppage that the Stockholm bourse, which had already been scheduled to close early ahead of a public holiday, and the Baltic exchanges would remain closed for the rest of the day.

    Trading in Copenhagen, Helsinki and Reykjavik will resume with opening auctions at 15:05 GMT, it said, followed by continuous trading from 1615 CET (1515 GMT).

    Those bourses are scheduled to close as per normal trading hours.

    “We are working diligently to correct the technical issue we experienced with our Nordic and Baltic markets today,” Lauri Rosendahl, head of European equities and derivatives at Nasdaq, said in an email.

    “We apologize for the inconvenience this has created for our exchange members and all investors in our markets.”

    Trading on the Norwegian bourse, which is owned by Euronext, was not interrupted. NASDAQ and London Stock Exchange remained unchanged.
    

    """


# Polly example

We will use this method: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/polly.html#Polly.Client.synthesize_speech
        
Note that Polly has a lot of options. You can change the voice, the engine (we use the neural engine, which gives more natural-sounding speech) and the language. You can also mark up the text with SSML.

Note that the synthesize speech method generates an audio stream.
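The text markup Polly understands is SSML. Here is a minimal sketch, assuming we want a short pause between paragraphs. The helper name `to_ssml` and the 500 ms pause are illustrative choices and not part of this lab. To use the result, pass it as `Text` together with `TextType='ssml'` in the `synthesize_speech` call.

```python
from xml.sax.saxutils import escape

def to_ssml(text, pause_ms=500):
    """Wrap plain text in <speak> and insert a pause between paragraphs."""
    # Paragraphs are assumed to be separated by a blank line
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    pause = '<break time="{}ms"/>'.format(pause_ms)
    # escape() protects characters such as & and < that would break the markup
    body = pause.join(escape(p) for p in paragraphs)
    return "<speak>" + body + "</speak>"

print(to_ssml("Markets were halted.\n\nTrading resumed & then closed."))
```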

In [132]:
def synthesizeText(text, engine="neural", languageCode="en-US", VoiceId = "Joanna"):
    client = boto3.client('polly')
    response = client.synthesize_speech(
        Engine=engine,
        LanguageCode=languageCode,
        OutputFormat='mp3',
        #SpeechMarkTypes=['ssml', 'sentence', 'viseme', 'word'],
        Text=text,
        VoiceId=VoiceId
    )
    return response


### Now we actually invoke Polly with our text sample

In [133]:
pollyresults = synthesizeText(textSample)


In [134]:
print(pollyresults)
#JSON(pollyresults)

{'ResponseMetadata': {'RequestId': '54fd9be8-090f-44f9-8a75-e3bc3aea77ba', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '54fd9be8-090f-44f9-8a75-e3bc3aea77ba', 'x-amzn-requestcharacters': '1469', 'content-type': 'audio/mpeg', 'transfer-encoding': 'chunked', 'date': 'Thu, 23 Jan 2020 14:49:09 GMT'}, 'RetryAttempts': 0}, 'ContentType': 'audio/mpeg', 'RequestCharacters': '1469', 'AudioStream': <botocore.response.StreamingBody object at 0x7f6ef6dcada0>}


#### Next we save the audio stream to a file so that we can play it back

In [135]:
with open('pollyspeech.mp3', 'wb') as file:
    file.write(pollyresults['AudioStream'].read())
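The audio stream can only be read once, so it is worth saving it carefully. The sketch below copies any file-like stream to disk in chunks and closes it afterwards. The helper name `save_stream` is our own, and an in-memory `BytesIO` object stands in for `pollyresults['AudioStream']` so that the example runs without calling AWS.

```python
import io
import os
import shutil
import tempfile
from contextlib import closing

def save_stream(stream, path):
    """Copy a file-like audio stream to disk and return the number of bytes written."""
    with closing(stream), open(path, "wb") as out:
        shutil.copyfileobj(stream, out)  # reads and writes in chunks
        return out.tell()

# Stand-in for pollyresults['AudioStream'] (assumption: any object with read() and close())
fake_audio = io.BytesIO(b"not real audio")
target = os.path.join(tempfile.gettempdir(), "pollyspeech-sketch.mp3")
print(save_stream(fake_audio, target), "bytes written")
```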

#### You can now play back the audio. 

Turn up your audio! Press the player to hear the output from Polly.

In [136]:
#!ls
ipd.Audio('pollyspeech.mp3')

### Here are some Polly use cases

![title](Polly-contact-center.png)
![title](Polly-content-creation.png)
![title](Polly-elearning.png)


# Amazon Translate

We are now going to translate the sample text into a different language.

We will be using the following method: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/translate.html#Translate.Client.translate_text
        
### Translate implementation

In [137]:
def translate(text, sourcelang="en", targetlang="fr"):
    client = boto3.client('translate')
    response = client.translate_text(
        Text=text,
        SourceLanguageCode=sourcelang,
        TargetLanguageCode=targetlang
    )
    return response
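The response from `translate_text` is a dictionary. Besides the metadata, the fields we care about are `TranslatedText`, `SourceLanguageCode` and `TargetLanguageCode`. A small sketch of pulling them out follows; the helper name `summarize_translation` and the sample response are made up for illustration and are not real service output.

```python
def summarize_translation(response, preview_chars=80):
    """Return a one-line summary of a translate_text response."""
    text = response["TranslatedText"].strip()
    return "{} -> {}: {}".format(
        response["SourceLanguageCode"],
        response["TargetLanguageCode"],
        text[:preview_chars],
    )

# Hand-written sample in the shape of a translate_text response
sample_response = {
    "TranslatedText": "Bonjour le monde",
    "SourceLanguageCode": "en",
    "TargetLanguageCode": "fr",
}
print(summarize_translation(sample_response))
```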

In [138]:
translated = translate(textSample)
JSON(translated)

<IPython.core.display.JSON object>

### Translate use cases

![title](Translate-use-cases.png)

# Polly output in French

In this example we let Polly speak the translated text in French, using the standard engine and the Canadian French voice Chantal.

In [139]:
pollyresults = synthesizeText(translated["TranslatedText"], engine="standard", languageCode="fr-CA", VoiceId = "Chantal")

In [140]:
print(pollyresults)

{'ResponseMetadata': {'RequestId': '3896400c-b738-40c2-a5de-0dd59b652bda', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '3896400c-b738-40c2-a5de-0dd59b652bda', 'x-amzn-requestcharacters': '1738', 'content-type': 'audio/mpeg', 'transfer-encoding': 'chunked', 'date': 'Thu, 23 Jan 2020 14:49:14 GMT'}, 'RetryAttempts': 0}, 'ContentType': 'audio/mpeg', 'RequestCharacters': '1738', 'AudioStream': <botocore.response.StreamingBody object at 0x7f6ef6e1b588>}


### We save the audio stream to a file so that we can play it

In [141]:
with open('pollyspeechv2.mp3', 'wb') as file:
    file.write(pollyresults['AudioStream'].read())

### You can now listen to the audio

In [142]:
ipd.Audio('pollyspeechv2.mp3')

# Comprehend example

We will be using a few different APIs in Comprehend.

For details see: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend.html#client

![title](Comprehend-how-it-works.png)
![title](Comprehend-medical-how-it-works.png)


     

### First we do entity detection

The documentation is here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend.html#Comprehend.Client.detect_entities

We have to pass in the language as a parameter, so first we detect the dominant language:

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend.html#Comprehend.Client.detect_dominant_language
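The detection response contains a `Languages` list in which each entry has a `LanguageCode` and a `Score`. The sketch below picks the highest-scoring entry explicitly instead of relying on list order. The helper name `pick_dominant_language`, the fallback to `"en"` and the sample response are our own assumptions.

```python
def pick_dominant_language(response, default="en"):
    """Return the language code with the highest score, or a default if none was found."""
    languages = response.get("Languages", [])
    if not languages:
        return default
    best = max(languages, key=lambda lang: lang["Score"])
    return best["LanguageCode"]

# Hand-written sample in the shape of a detect_dominant_language response
sample_response = {
    "Languages": [
        {"LanguageCode": "fr", "Score": 0.12},
        {"LanguageCode": "en", "Score": 0.87},
    ]
}
print(pick_dominant_language(sample_response))
```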


In [143]:
def detect_dominant_language(text):
    client = boto3.client('comprehend')
    response = client.detect_dominant_language(
        Text=text
    )    
    language = response["Languages"][0]["LanguageCode"]
    print("language: ", language )
    #print("languages: ", response)
    return language


def detect_entities(text):
    client = boto3.client('comprehend')
    language = detect_dominant_language(text)
    response = client.detect_entities(Text=text, LanguageCode=language)
    #print(json.dumps(response, indent=2))
    return response

In [144]:
entities = detect_entities(textSample)
JSON(entities)

language:  en


<IPython.core.display.JSON object>
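The JSON viewer is handy for browsing, but a flat summary is often easier to read. Each item in the `Entities` list has a `Text`, a `Type` and a `Score`. Here is a sketch that groups entity texts by type and drops low-confidence hits. The helper name `group_entities`, the 0.8 threshold and the sample data are illustrative assumptions, not real output.

```python
def group_entities(response, min_score=0.8):
    """Group unique entity texts by entity type, ignoring low-confidence entities."""
    grouped = {}
    for entity in response.get("Entities", []):
        if entity["Score"] < min_score:
            continue
        texts = grouped.setdefault(entity["Type"], [])
        if entity["Text"] not in texts:
            texts.append(entity["Text"])
    return grouped

# Hand-written sample in the shape of a detect_entities response
sample_response = {
    "Entities": [
        {"Text": "Nasdaq", "Type": "ORGANIZATION", "Score": 0.99},
        {"Text": "Nasdaq", "Type": "ORGANIZATION", "Score": 0.97},
        {"Text": "Finland", "Type": "LOCATION", "Score": 0.98},
        {"Text": "Friday", "Type": "DATE", "Score": 0.55},
    ]
}
print(group_entities(sample_response))
```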

### Next we detect keyphrases

We use this function to help us color-code key phrases for visualization.



In [145]:
def displayKeyPhrases(keyPhrases):
    colors = ["#d16ba5", "#c777b9", "#ba83ca", "#aa8fd8", "#9a9ae1", "#8aa7ec", "#79b3f4", "#69bff8", "#52cffe", "#41dfff", "#46eefa", "#5ffbf1"]
    counter = 0
    thisHTML = ""
    for keyPhrase in keyPhrases:
        thisHTML += '<span style="background-color: ' + colors[counter] + ' ">' + keyPhrase["Text"] + '</span> '
        counter += 1
        if counter >= len(colors):
            counter = 0
    display(HTML(thisHTML))
                     
def detect_key_phrases(text):
    client = boto3.client('comprehend') 
    language = detect_dominant_language(text)
    response = client.detect_key_phrases(
            Text=text,
            LanguageCode=language
        )    
    #print(json.dumps(response, indent=2))
    return response    
      

In [146]:
keyphrases = detect_key_phrases(textSample)
JSON(keyphrases)

language:  en


<IPython.core.display.JSON object>

In [147]:
displayKeyPhrases(keyphrases["KeyPhrases"])
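One thing to watch when building HTML from detected text: if a phrase contains characters such as `<` or `&`, they end up in the markup unescaped. Below is a sketch of a variant that escapes each phrase and cycles through the colors. The name `key_phrases_to_html` and the shortened color list are our own. The function returns the HTML string, which you can pass to `display(HTML(...))`.

```python
import html
from itertools import cycle

def key_phrases_to_html(key_phrases, colors=("#d16ba5", "#79b3f4", "#5ffbf1")):
    """Build color-coded <span> elements for key phrases, escaping the phrase text."""
    spans = []
    # cycle() restarts the color list when we run out of colors
    for phrase, color in zip(key_phrases, cycle(colors)):
        spans.append(
            '<span style="background-color: {}">{}</span>'.format(
                color, html.escape(phrase["Text"])
            )
        )
    return " ".join(spans)

print(key_phrases_to_html([{"Text": "Nordic & Baltic markets"}, {"Text": "12:00 GMT"}]))
```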

### Next we do sentiment detection

The docs are here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend.html#Comprehend.Client.detect_sentiment

In [148]:
def detect_sentiment(text):
    client = boto3.client('comprehend') 
    language = detect_dominant_language(text)
    response = client.detect_sentiment(
            Text=text,
            LanguageCode=language
        )    
    #print(json.dumps(response, indent=2))
    return response    
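The sentiment response has a `Sentiment` label (`POSITIVE`, `NEGATIVE`, `NEUTRAL` or `MIXED`) and a `SentimentScore` dictionary with one confidence value per label. A sketch of turning that into a short string follows. The helper name `describe_sentiment` and the sample numbers are made up for illustration.

```python
def describe_sentiment(response):
    """Return the sentiment label together with its confidence as a percentage."""
    label = response["Sentiment"]
    # Score keys are capitalized, e.g. "NEUTRAL" -> "Neutral"
    confidence = response["SentimentScore"][label.capitalize()]
    return "{} ({:.0%})".format(label, confidence)

# Hand-written sample in the shape of a detect_sentiment response
sample_response = {
    "Sentiment": "NEUTRAL",
    "SentimentScore": {"Positive": 0.02, "Negative": 0.05, "Neutral": 0.91, "Mixed": 0.02},
}
print(describe_sentiment(sample_response))
```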

In [149]:
sentiment = detect_sentiment(textSample)
JSON(sentiment)

language:  en


<IPython.core.display.JSON object>

### Next we will do syntax analysis

The API specs are here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/comprehend.html#Comprehend.Client.detect_syntax
        

In [150]:
def displaySyntax(syntaxTokens):
    thisHTML = "<h3>Syntax</h3><br>"
    tagHTML = "<h3>Syntax Legend </h3>"
    allTags = []
    for syntaxToken in syntaxTokens:
        c = ColorHash(syntaxToken["PartOfSpeech"]["Tag"])
        allTags.append(syntaxToken["PartOfSpeech"]["Tag"])
        thisHTML += '<span style="background-color: ' + str(c.hex) + ' ">' + syntaxToken["Text"] + '</span> '
    allTags = list(set(allTags))
    for tag in allTags:
        c = ColorHash(tag)
        tagHTML += '<span style="background-color: ' + str(c.hex) + ' ">' + tag + '</span> ' 
    display(HTML(tagHTML))
    display(HTML(thisHTML))
    
def detect_syntax(text):
    client = boto3.client('comprehend') 
    language = detect_dominant_language(text)
    response = client.detect_syntax(
            Text=text,
            LanguageCode=language
        )    
    #print(json.dumps(response, indent=2))
    return response                      


In [151]:
syntax = detect_syntax(textSample)

language:  en


In [152]:
displaySyntax(syntax["SyntaxTokens"])
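Besides the color-coded view, it can be useful to count how often each part-of-speech tag occurs. Here is a minimal sketch using `collections.Counter`. The helper name `count_tags` and the sample tokens are illustrative; with the real data you would pass in `syntax["SyntaxTokens"]`.

```python
from collections import Counter

def count_tags(syntax_tokens):
    """Count how many tokens carry each part-of-speech tag."""
    return Counter(token["PartOfSpeech"]["Tag"] for token in syntax_tokens)

# Hand-written sample in the shape of the SyntaxTokens list
sample_tokens = [
    {"Text": "Nordic", "PartOfSpeech": {"Tag": "PROPN", "Score": 0.9}},
    {"Text": "markets", "PartOfSpeech": {"Tag": "NOUN", "Score": 0.9}},
    {"Text": "were", "PartOfSpeech": {"Tag": "AUX", "Score": 0.9}},
    {"Text": "halted", "PartOfSpeech": {"Tag": "VERB", "Score": 0.9}},
    {"Text": "problems", "PartOfSpeech": {"Tag": "NOUN", "Score": 0.9}},
]
for tag, count in count_tags(sample_tokens).most_common():
    print(tag, count)
```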

### Comprehend use cases

![title](Comprehend-classify-tickets-for-better-handling.png)
![title](Comprehend-use-case-knowledge-management-discovery.png)
![title](Comprehend-use-case-search.png)
![title](Comprehend-use-case-voice-of-customer.png)



# Amazon Transcribe example

We are now going to run through a Transcribe example. But first let us create a bucket (if it doesn't exist) and set things up for this part of the lab.

In [153]:
def createBucket(bucketname):
    s3 = boto3.client('s3')
    response = s3.list_buckets()
    existingbuckets = [d['Name'] for d in response["Buckets"]]
    #print(existingbuckets)
    if bucketname not in existingbuckets:
        print("creating bucket " + bucketname)
        s3.create_bucket(Bucket=bucketname)
    else:
        print("bucket exists! " + bucketname)
    return bucketname
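One caveat: calling `create_bucket` with only a bucket name works in `us-east-1`, but in other regions S3 expects a `CreateBucketConfiguration` with a `LocationConstraint`. The sketch below builds the right arguments for either case. The helper name `create_bucket_kwargs` is our own, and the bucket name in the example is a placeholder.

```python
def create_bucket_kwargs(bucketname, region):
    """Build keyword arguments for s3.create_bucket that work in any region."""
    kwargs = {"Bucket": bucketname}
    # us-east-1 is the default location and must not be passed as a constraint
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs

# Possible usage with a boto3 client (not executed here):
#   s3.create_bucket(**create_bucket_kwargs(bucketname, s3.meta.region_name))
print(create_bucket_kwargs("example-bucket", "eu-west-1"))
```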

In [154]:
accountid = boto3.client('sts').get_caller_identity().get('Account')
bucketname = "aimlbootcamp" + accountid
createBucket(bucketname)

bucket exists! aimlbootcamp247322960887


'aimlbootcamp247322960887'

In [155]:
def upload_file(file_name, bucket, object_name=None):

    if object_name is None:
        object_name = file_name

    s3_client = boto3.client('s3')
    response = s3_client.upload_file(file_name, bucket, object_name)
    mediafileUri = "https://" + bucket + ".s3.amazonaws.com/" + object_name
    return mediafileUri


In [156]:
mediafileUri = upload_file("pollyspeech.mp3", bucketname, object_name=None) #use the file that was created earlier
print(mediafileUri)

https://aimlbootcamp247322960887.s3.amazonaws.com/pollyspeech.mp3


### This function pulls the transcribed text when the Transcribe job is done

In [157]:
def getTranscriptTextFromS3(bucket, path):
    client = boto3.client('s3')
    #https://s3.amazonaws.com/aimlbootcamp485483564801/aws-aimlbootcamp-1572965192.json
    s3Bucket = bucket
    s3ObjectKey = path.split(bucket+"/")[1]
    print("reading results: ", s3Bucket, " : ", s3ObjectKey)
    response = client.get_object(
        Bucket=s3Bucket,
        Key=s3ObjectKey
        )
    s3Body = response["Body"].read().decode('utf-8')
    json_content = json.loads(s3Body)
    transcriptText = json_content["results"]["transcripts"][0]["transcript"]
    return transcriptText
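Splitting the result path on the bucket name works for URLs like the one in the comment above, but it raises an `IndexError` if the bucket name is not followed by a slash in the path. As a sketch, here is a more tolerant way to get the object key with `urllib.parse`. The helper name `key_from_transcript_uri` is our own, and handling the virtual-hosted URL style is an assumption about what you might encounter.

```python
from urllib.parse import urlparse, unquote

def key_from_transcript_uri(uri, bucket):
    """Extract the S3 object key from a transcript file URL."""
    path = unquote(urlparse(uri).path).lstrip("/")
    prefix = bucket + "/"
    if path.startswith(prefix):
        # Path-style URL: https://s3.amazonaws.com/<bucket>/<key>
        return path[len(prefix):]
    # Virtual-hosted style URL: https://<bucket>.s3.amazonaws.com/<key>
    return path

print(key_from_transcript_uri(
    "https://s3.amazonaws.com/aimlbootcamp485483564801/aws-aimlbootcamp-1572965192.json",
    "aimlbootcamp485483564801",
))
```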

### This part actually kicks off the transcription and monitors the job

The docs are here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#TranscribeService.Client.start_transcription_job
        
You will notice that we loop to monitor the job for completion. We do this to keep the demo inside a notebook. In a real implementation you can instead trigger the next step when the transcribed text arrives in the S3 destination bucket (an S3 event trigger), which removes the polling code.
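To give an idea of the S3 trigger approach, here is a sketch of the part of an event handler that finds newly arrived transcript files. It assumes the standard S3 event notification layout (`Records`, then `s3.bucket.name` and `s3.object.key`) and that transcripts are the `.json` objects in the bucket. The function name and the sample event are made up for illustration.

```python
from urllib.parse import unquote_plus

def transcript_objects_from_event(event):
    """Return (bucket, key) pairs for the .json objects named in an S3 event."""
    found = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.endswith(".json"):
            found.append((bucket, key))
    return found

# Hand-written sample event with one transcript and one audio file
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "aws-aimlbootcamp-1579790956.json"}}},
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "pollyspeech.mp3"}}},
    ]
}
print(transcript_objects_from_event(sample_event))
```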


In [158]:
def doTranscription(mediafileUri, outputbucket, languagecode='en-US', MediaFormat='mp3'):
    client = boto3.client('transcribe')
    jobname = 'aws-aimlbootcamp-' + str(int(time.time()))
    response = client.start_transcription_job(
        TranscriptionJobName=jobname,
        LanguageCode=languagecode,
        MediaFormat=MediaFormat,
        Media={
            'MediaFileUri': mediafileUri
        },
        OutputBucketName=outputbucket,
        #OutputEncryptionKMSKeyId='string',
        Settings={
            #'VocabularyName': 'string',
            'ShowSpeakerLabels': True,
            'MaxSpeakerLabels': 10
            #'ChannelIdentification': True
        }
    )
    #return response["TranscriptionJob"]["TranscriptionJobName"]
    
    max_time = time.time() + 10*60 # 10 minutes
    transcribedResults = ""
    while time.time() < max_time:
        jobstatusresponse = client.get_transcription_job(
            TranscriptionJobName=jobname
        )
        status = jobstatusresponse["TranscriptionJob"]["TranscriptionJobStatus"]
        print(datetime.datetime.now(), " Job Status: {}".format(status), "             ", end='\r')

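        if status == "COMPLETED":
            #print(jobstatusresponse)
            resultpath = jobstatusresponse["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]
            transcribedResults = getTranscriptTextFromS3(outputbucket, resultpath)
            break
        if status == "FAILED":
            # A failed job has no transcript, so report the reason instead
            print("Transcription failed: ", jobstatusresponse["TranscriptionJob"].get("FailureReason"))
            break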

        time.sleep(4.3562)
        
    return jobname, transcribedResults
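The monitoring loop can also be pulled out into a small reusable helper, which makes it easy to test without calling AWS. This is a sketch under our own assumptions: the name `wait_for`, its parameters and the 5 second interval are illustrative. `get_status` is any function that returns the current job status as a string.

```python
import time

def wait_for(get_status, done=("COMPLETED", "FAILED"), timeout=600, interval=5,
             sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a final status or the timeout expires."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in done:
            return status
        if clock() >= deadline:
            raise TimeoutError("gave up waiting; last status: " + status)
        sleep(interval)

# Simulated job that finishes on the third check (no real waiting)
statuses = iter(["QUEUED", "IN_PROGRESS", "COMPLETED"])
print(wait_for(lambda: next(statuses), sleep=lambda seconds: None))
```

With the Transcribe client, `get_status` could be a lambda that calls `get_transcription_job` and returns `["TranscriptionJob"]["TranscriptionJobStatus"]`.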


### We played this audio before, but for reference this is what we are going to transcribe

In [159]:
ipd.Audio('pollyspeech.mp3')

### Here we print back the text that got transcribed

In [160]:
transcribejob, transcriberesults = doTranscription(mediafileUri, bucketname, languagecode='en-US', MediaFormat='mp3')
#print(transcribejob)
print(transcriberesults)



reading results:  aimlbootcamp247322960887  :  aws-aimlbootcamp-1579790956.json
Nordic and Baltic stock markets were halted by technical problems for a second time on Friday, only minutes after trading resumed following earlier problems. Equity and equity derivatives markets had reopened a 12 o'clock GMT after a two hour trading halt that NASDAQ attributed to connectivity issues due to technical disturbances. NASDAQ, Nordic Equity and NASDAQ Nordic Index and equity derivatives markets have been halted again, operator NASDAQ said in an e mailed statement. The company operates forces in Finland, Denmark, Sweden, Iceland, Estonia, Latvia and Lithuania. NASDAQ said after the second stoppage that the Stockholm force, which had already been scheduled to close early ahead of a public holiday, and the Baltic exchanges would remain closed for the rest of the day. Trading in Copenhagen, Helsinki in Reykjavik will resume with opening auctions at 15 05 GMT, it said, followed by continuous trading 
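The transcript file holds more than the plain text. Under `results`, the `items` list has one entry per word or punctuation mark, and the word entries carry start and end times as strings. The sketch below extracts the words with their timings. The helper name `words_with_times` and the sample data are made up for illustration and are not taken from the job above.

```python
def words_with_times(transcript_json):
    """Return (word, start_seconds, end_seconds) tuples from a transcript document."""
    words = []
    for item in transcript_json["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items have no timestamps
        words.append((
            item["alternatives"][0]["content"],
            float(item["start_time"]),
            float(item["end_time"]),
        ))
    return words

# Hand-written sample in the shape of a Transcribe result file
sample_transcript = {
    "results": {
        "transcripts": [{"transcript": "Nordic markets."}],
        "items": [
            {"type": "pronunciation", "start_time": "0.04", "end_time": "0.51",
             "alternatives": [{"confidence": "1.0", "content": "Nordic"}]},
            {"type": "pronunciation", "start_time": "0.51", "end_time": "1.02",
             "alternatives": [{"confidence": "0.98", "content": "markets"}]},
            {"type": "punctuation",
             "alternatives": [{"confidence": "0.0", "content": "."}]},
        ],
    }
}
print(words_with_times(sample_transcript))
```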

### Transcribe use cases
![title](Transcribe-use-cases.png)