Our project will focus on the AWS ML service **Amazon Transcribe**, which converts speech to text. 

<img src="https://github.com/cindyfangw/QTM350/blob/0392cc65f9276917da58ceb44ecd4dbdea4cdc89/AmazonTranscribe.png?raw=true"
     width="300" height="250">

Amazon Transcribe makes it easy for developers to add speech-to-text capabilities to their applications. Audio data is virtually impossible for computers to search and analyze, so recorded speech needs to be converted to text before it can be used in applications. Historically, customers had to work with transcription providers that required expensive contracts and were hard to integrate into their technology stacks. Many of these providers use outdated technology that does not adapt well to different scenarios, such as the low-fidelity phone audio common in contact centers, which results in poor accuracy.<br>

Amazon Transcribe uses a deep learning process called automatic speech recognition (ASR) to convert speech to text quickly and accurately. It can be used to transcribe customer service calls, automate subtitling, and generate metadata for media assets to create a fully searchable archive. Amazon Transcribe Medical adds medical speech-to-text capabilities to clinical documentation applications.

<img src="https://github.com/cindyfangw/QTM350/blob/0392cc65f9276917da58ceb44ecd4dbdea4cdc89/AmazonTranscribeFeatures.png?raw=true"
     width="600" height="250">

Amazon Transcribe has many benefits: <br>
1. Unlock the value of audio and video content
2. Transform customer experiences
3. Save time & money with accurate transcripts
4. Ensure customer privacy & safety

**So let's get started! We are going to use the AWS CLI method this time!**

We first need to create a new S3 bucket via the AWS console. <br>

In addition, we need to attach a policy to the role used by the current SageMaker instance so that it has full access to Amazon Transcribe:
1. Go to IAM (Identity and Access Management) and open Policies
2. Search for and select "AmazonTranscribeFullAccess"
3. Click Actions in the right corner and choose Attach

### Set up the AWS CLI (Command Line Interface)

The AWS CLI is preinstalled on any new SageMaker instance, so we only need to run the following command. It lists the commands available for working with S3 from the command line.

In [50]:
!aws s3 help

S3()                                                                      S3()



NAME
       s3 -

DESCRIPTION
       This  section  explains  prominent concepts and notations in the set of
       high-level S3 commands provided.

   Path Argument Type
       Whenever using a command, at least one path argument must be specified.
       There are two types of path arguments: LocalPath and S3Uri.

       LocalPath: represents the path of a local file or directory.  It can be
       written as an absolute path or relative path.

       S3Uri: represents the location of a S3 object, prefix, or bucket.  This
       must  be  written in the form s3://mybucket/mykey where mybucket is the
       specified S3 bucket, mykey is the specified S3 key.  The path  argument
       must  begin with s3:// in order to denote that the path argument refers
       to a S3 object. Note that

We already created a new bucket named "qtm350finalproject" via the AWS console. For more instructions, please see __[AWS User Guide - Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html)__.

Then we list the buckets in our account with the s3 ls command.

In [40]:
! aws s3 ls

2021-11-01 22:22:00 hw0102030405
2021-11-03 02:32:28 hw3q5group2
2021-11-10 02:48:00 qtm350finalproject


### Set Up Amazon Transcribe

In [1]:
!aws transcribe help

TRANSCRIBE()                                                      TRANSCRIBE()



NAME
       transcribe -

DESCRIPTION
       Operations and objects for transcribing speech to text.

AVAILABLE COMMANDS
       +o create-call-analytics-category

       +o create-language-model

       +o create-medical-vocabulary

       +o create-vocabulary

       +o create-vocabulary-filter

       +o delete-call-analytics-category

       +o delete-call-analytics-job

       +o delete-language-model

       +o delete-medical-transcription-job

       +o delete-medical-vocabulary

       +o delete-transcription-job

       +o delete-vocabulary

       +o delete-vocabulary-filter

       +o describe-language-model

       +o get-call-analytics-category

       +o get-call-analytics-job

       +o get-medical-transcription-job

       +o get-medical-vocabulary

       +o get-transcription-job

       +o get-voc

We first upload all the voice recordings collected from our group members, together with the sample text file, to our bucket (qtm350finalproject). <br>

We are going to use the start-transcription-job command to run the transcription.
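The commands in this notebook pass the job settings through `--cli-input-json`, but the contents of those JSON files are not shown. As a rough sketch of what such a file might contain, the snippet below builds a request with fields from the StartTranscriptionJob API. The helper `build_job_request`, the media key `recording.mp3`, and the file name `TranscribeSample-example.json` are hypothetical; the bucket name, language code, and job-name pattern follow the outputs shown in this notebook.

```python
import json


def build_job_request(job_name, media_uri, output_bucket,
                      language_code="en-US", media_format="mp3"):
    """Return a dict shaped like a StartTranscriptionJob request.

    This is an illustrative helper, not part of the AWS CLI.
    """
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
        # With an output bucket and no output key, the transcript is
        # expected to be stored as <job name>.json in that bucket.
        "OutputBucketName": output_bucket,
    }


request = build_job_request(
    job_name="Alexa-Naye-Text",
    media_uri="s3://qtm350finalproject/recording.mp3",  # hypothetical key
    output_bucket="qtm350finalproject",
)

# Hypothetical file name; pass it as --cli-input-json file://<name>
with open("TranscribeSample-example.json", "w") as f:
    json.dump(request, f, indent=4)

print(json.dumps(request, indent=4))
```

Keeping one such file per speaker, as this notebook does with the TranscribeSample-[name].json files, lets the same CLI command be reused for each recording.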

In [3]:
!ls

Alexa-aaron-Text.json  Alexa-Naye-Text.json	    TranscribeSample-cindy.json
Alexa-annie-Text.json  AlexaSample.txt		    TranscribeSample.json
Alexa-cindy-Text.json  qtm350 recording — Naye.mp3  TranscribeSample-kd.json
Alexa-kd-Text.json     Transcribe.ipynb		    TranscribeSample-mary.json
Alexa-mary-Text.json   TranscribeSample-aaron.json  TranscribeSample-Polly.json
Alexa-Mary-Text.json   TranscribeSample-annie.json


In [2]:
!aws transcribe start-transcription-job help

START-TRANSCRIPTION-JOB()                            START-TRANSCRIPTION-JOB()



NAME
       start-transcription-job -

DESCRIPTION
       Starts an asynchronous job to transcribe speech to text.

       See also: AWS API Documentation

       See 'aws help' for descriptions of global parameters.

SYNOPSIS
            start-transcription-job
          --transcription-job-name <value>
          [--language-code <value>]
          [--media-sample-rate-hertz <value>]
          [--media-format <value>]
          --media <value>
          [--output-bucket-name <value>]
          [--output-key <value>]
          [--output-encryption-kms-key-id <value>]
          [--kms-encryption-context <value>]
          [--settings <value>]
          [--model-settings <value>]
          [--job-execution-settings <value>]
          [--content-redaction <value>]
          [--identify-language | --no-identify-language]
          [--language-options <value>]
          [--subtitles <va

In [7]:
!aws transcribe start-transcription-job \
    --cli-input-json file://TranscribeSample.json


An error occurred (ConflictException) when calling the StartTranscriptionJob operation: The requested job name already exists. Use a different job name.
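The ConflictException above appears because a job with this name was already submitted in an earlier run, and job names must be unique within an account. If a fresh run is wanted, one option is to derive a new name each time. The sketch below is a hypothetical helper (`unique_job_name` is not an AWS function) that appends a UTC timestamp and keeps only characters allowed in job names, assuming the documented limits of letters, digits, `.`, `_`, `-` and at most 200 characters.

```python
import re
from datetime import datetime, timezone


def unique_job_name(base, now=None):
    """Return `base` plus a UTC timestamp suffix, safe for use as a job name."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    # Replace any character outside the allowed set with a hyphen.
    safe_base = re.sub(r"[^0-9a-zA-Z._-]", "-", base)
    # Leave room for the separator and the timestamp within 200 characters.
    safe_base = safe_base[: 200 - len(stamp) - 1]
    return f"{safe_base}-{stamp}"


print(unique_job_name("Alexa-Naye-Text"))
```

Note that when no output key is given, the transcript file is named after the job, so a renamed job would also change the file name used in the later download step.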


In [9]:
!aws s3 cp help

CP()                                                                      CP()



NAME
       cp -

DESCRIPTION
       Copies a local file or S3 object to another location locally or in S3.

       See 'aws help' for descriptions of global parameters.

SYNOPSIS
            cp
          <LocalPath> <S3Uri> or <S3Uri> <LocalPath> or <S3Uri> <S3Uri>
          [--dryrun]
          [--quiet]
          [--include <value>]
          [--exclude <value>]
          [--acl <value>]
          [--follow-symlinks | --no-follow-symlinks]
          [--no-guess-mime-type]
          [--sse <value>]
          [--sse-c <value>]
          [--sse-c-key <value>]
          [--sse-kms-key-id <value>]
          [--sse-c-copy-source <value>]
          [--sse-c-copy-source-key <value>]
          [--storage-class <value>]
          [--grants <value> [<value>...]]
          [--website-redirect <value>]
          [--content-type <value>]
          [--cache-cont

After the transcription finishes, Amazon Transcribe writes a file named "Alexa-[name]-Text.json" to our bucket, and we download this file.
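Because start-transcription-job is asynchronous, the output file only exists once the job has finished. A minimal polling sketch is shown below. `wait_for_job` is a hypothetical helper; it takes any function that returns the current status string, which in a real notebook could wrap `aws transcribe get-transcription-job` and read the `TranscriptionJobStatus` field. Here a fake status sequence stands in for the service so the example runs without AWS access.

```python
import time


def wait_for_job(get_status, poll_seconds=5.0, timeout_seconds=600.0,
                 sleep=time.sleep):
    """Poll get_status() until the job reaches a terminal state.

    Returns "COMPLETED" or "FAILED"; raises TimeoutError if neither
    is seen within timeout_seconds of accumulated waiting.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still {status} after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for the real service: the job is queued, runs, then completes.
fake_statuses = iter(["QUEUED", "IN_PROGRESS", "COMPLETED"])
final_status = wait_for_job(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)
```

Only after the status is COMPLETED is it safe to run the `aws s3 cp` download that follows.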

In [10]:
!aws s3 cp s3://qtm350finalproject/Alexa-Naye-Text.json Alexa-Naye-Text.json

download: s3://qtm350finalproject/Alexa-Naye-Text.json to ./Alexa-Naye-Text.json


In [11]:
import json
import pandas as pd

We extract the transcription text from the JSON file.

In [12]:
with open("Alexa-Naye-Text.json", "r") as f:
    array = json.load(f)

In [13]:
transcription = array["results"]["transcripts"][0]["transcript"]

In [14]:
with open("AlexaSample.txt", "r") as f1:
    text = f1.read()

In [15]:
transcription

"Yeah, Alexa play Wake me up when september ends Alex. That volume up Alexa. I'm boring. So tell me a joke Alexa. Is it going to ring today Alexa? What is today's date Alexa? What bus should I take to get to Emory University Alexa? I broke up with my significant other and I'm really sad. What should I do Alexa? I want to buy a birthday gift for my best friend. Do you have any suggestions?"

In [16]:
text

'Alexa play "Wake me up when September Ends". Alexa Volume Up. Alexa I\'m boring so tell me a joke. Alexa is it going to rain today? Alexa what is today\'s date? Alexa what bus should I take to get to Emory University? Alexa I broke up with my significant other and I\'m really sad. What should I do? Alexa I want to buy a birthday gift for my best friend, do you have any suggestions?'
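The two strings above differ mainly in punctuation, capitalization, and a few misheard words. One way to put a number on the difference is a word error rate: the word-level edit distance divided by the number of reference words. The sketch below is an assumption about how such a comparison could be done, not the method used in this project; `words` and `word_error_rate` are hypothetical helpers that lowercase the text and drop punctuation before comparing.

```python
import re


def words(s):
    """Lowercase the text and return its words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", s.lower())


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = words(reference), words(hypothesis)
    if not ref:
        raise ValueError("reference text contains no words")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# One substituted word ("rain" heard as "ring") out of six reference words.
print(word_error_rate("Is it going to rain today?", "Is it going to ring today"))
```

In the notebook, `word_error_rate(text, transcription)` would compare the sample text with a speaker's transcript. Because punctuation is removed, it does not matter that the transcripts attach "Alexa" to the end of the previous sentence.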

We use the same steps as above to transcribe all the recordings from our group members and extract the transcriptions.
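Since the load-and-extract step is identical for every speaker, it could also be wrapped in a small helper. The sketch below assumes the downloaded files follow the "Alexa-[name]-Text.json" naming used in this notebook and the `results.transcripts[0].transcript` layout read above; `load_transcript` and `load_all` are hypothetical names.

```python
import json


def load_transcript(path):
    """Return the transcript string from a Transcribe output JSON file."""
    with open(path, "r") as f:
        data = json.load(f)
    return data["results"]["transcripts"][0]["transcript"]


def load_all(names, pattern="Alexa-{name}-Text.json"):
    """Map each speaker name to the transcript in that speaker's file."""
    return {name: load_transcript(pattern.format(name=name)) for name in names}
```

For example, `load_all(["Naye", "aaron", "annie"])` would return a dictionary keyed by speaker once those files have been downloaded. The names are case-sensitive and must match the file names exactly.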

In [17]:
!aws transcribe start-transcription-job \
    --cli-input-json file://TranscribeSample-aaron.json


An error occurred (ConflictException) when calling the StartTranscriptionJob operation: The requested job name already exists. Use a different job name.


In [18]:
!aws s3 cp s3://qtm350finalproject/Alexa-aaron-Text.json Alexa-aaron-Text.json

download: s3://qtm350finalproject/Alexa-aaron-Text.json to ./Alexa-aaron-Text.json


In [19]:
with open("Alexa-aaron-Text.json", "r") as f:
    array = json.load(f)
transcription_aaron = array["results"]["transcripts"][0]["transcript"]
with open("AlexaSample.txt", "r") as f1:
    text = f1.read()

In [20]:
transcription_aaron

"Alexa Play Wake me up when september ends Alexa volume up Alexa. I'm boring. So tell me a joke Alexa, Is it going to rain today Alexa? What is today? State Alexa? What bus should I take to get to Emory University Alexa? I broke up with my significant other and I'm really sad. What should I do Alexa? I want to buy a birthday gift for my best friend. Do you have any suggestions?"

In [21]:
!aws transcribe start-transcription-job \
    --cli-input-json file://TranscribeSample-annie.json


An error occurred (ConflictException) when calling the StartTranscriptionJob operation: The requested job name already exists. Use a different job name.


In [22]:
!aws s3 cp s3://qtm350finalproject/Alexa-annie-Text.json Alexa-annie-Text.json

download: s3://qtm350finalproject/Alexa-annie-Text.json to ./Alexa-annie-Text.json


In [23]:
with open("Alexa-annie-Text.json", "r") as f:
    array = json.load(f)
transcription_annie = array["results"]["transcripts"][0]["transcript"]
with open("AlexaSample.txt", "r") as f1:
    text = f1.read()

In [24]:
transcription_annie

"Alexa play Wake me up when september ends Alexa volume up Alexa. I'm so boring. So tell me a joke Alexa, Is it going to rain today Alexa? What is today's Day Alexa? What bus should I take to get to Emory University? Mhm Alexa. I broke up with my significant other and I'm really sad. What should I do Alexa? I want to buy a birthday gift for my best friend. Do you have any suggestions?"

In [25]:
!aws transcribe start-transcription-job \
    --cli-input-json file://TranscribeSample-cindy.json


An error occurred (ConflictException) when calling the StartTranscriptionJob operation: The requested job name already exists. Use a different job name.


In [26]:
!aws s3 cp s3://qtm350finalproject/Alexa-cindy-Text.json Alexa-cindy-Text.json

download: s3://qtm350finalproject/Alexa-cindy-Text.json to ./Alexa-cindy-Text.json


In [27]:
with open("Alexa-cindy-Text.json", "r") as f:
    array = json.load(f)
transcription_cindy = array["results"]["transcripts"][0]["transcript"]
with open("AlexaSample.txt", "r") as f1:
    text = f1.read()

In [28]:
transcription_cindy

"Alexa Play Wake me up when september ends Alexa volume up Alexa. I'm boring. So tell me a joke Alexa, Is it going to rain today Alexa? What is today's date Alexa? What bus should I take to get to Emory University Alexa? I broke up with my significant other and I'm really sad. What should I do? Yeah, Alexa. I want to buy a birthday gift for my best friend. Do you have any suggestions?"

In [29]:
!aws transcribe start-transcription-job \
    --cli-input-json file://TranscribeSample-kd.json


An error occurred (ConflictException) when calling the StartTranscriptionJob operation: The requested job name already exists. Use a different job name.


In [30]:
!aws s3 cp s3://qtm350finalproject/Alexa-kd-Text.json Alexa-kd-Text.json

download: s3://qtm350finalproject/Alexa-kd-Text.json to ./Alexa-kd-Text.json


In [31]:
with open("Alexa-kd-Text.json", "r") as f:
    array = json.load(f)
transcription_kd = array["results"]["transcripts"][0]["transcript"]
with open("AlexaSample.txt", "r") as f1:
    text = f1.read()

In [32]:
transcription_kd

"Alexa Play Wake me up when september ends Alexa volume up Alexa. I'm boring. So tell me a joke Alexa, Is it going to rain today? Mhm Alexa. What is today's date? Alexa? What bus should I take to get to Emory University Alexa? I broke up with my significant other and I'm really sad. What should I do? Mhm. Alexa. I want to buy a birthday gift for my best friend. Do you have any suggestions?"

In [33]:
!aws transcribe start-transcription-job \
    --cli-input-json file://TranscribeSample-mary.json


An error occurred (ConflictException) when calling the StartTranscriptionJob operation: The requested job name already exists. Use a different job name.


In [34]:
!aws s3 cp s3://qtm350finalproject/Alexa-Mary-Text.json Alexa-Mary-Text.json

download: s3://qtm350finalproject/Alexa-Mary-Text.json to ./Alexa-Mary-Text.json


In [35]:
with open("Alexa-Mary-Text.json", "r") as f:
    array = json.load(f)
transcription_mary = array["results"]["transcripts"][0]["transcript"]
with open("AlexaSample.txt", "r") as f1:
    text = f1.read()

In [36]:
transcription_mary

"Alexa Play Wake me up when september ends Alexa volume up Alexa. I'm boring. So tell me a joke Alexa, Is it going to rain today Alexa? What is today's date Alexa? What bus should I take to get to Emory University Alexa? I broke up with my significant other and I am really sad. What should I do Alexa? I want to buy a birthday gift for my friend. Do you have any suggestions?"

We investigated Amazon Polly in HW3; it is an ML service that generates speech from a text file. We were therefore curious how an audio recording created by Amazon Polly would perform in Amazon Transcribe. <br>

We first generated the audio file with Amazon Polly.

In [41]:
!aws polly start-speech-synthesis-task \
    --output-format mp3 \
    --output-s3-bucket-name qtm350finalproject \
    --text  file://AlexaSample.txt \
    --voice-id Joanna

{
    "SynthesisTask": {
        "TaskId": "963d55e3-30c7-4dc4-86a3-caf0a1cd6649",
        "TaskStatus": "scheduled",
        "OutputUri": "https://s3.us-east-1.amazonaws.com/qtm350finalproject/963d55e3-30c7-4dc4-86a3-caf0a1cd6649.mp3",
        "CreationTime": 1636523199.865,
        "RequestCharacters": 381,
        "OutputFormat": "mp3",
        "TextType": "text",
        "VoiceId": "Joanna"
    }
}


Then we transcribe the audio file into text with Amazon Transcribe, using the same steps as above.

In [44]:
!aws transcribe start-transcription-job \
    --cli-input-json file://TranscribeSample-Polly.json

{
    "TranscriptionJob": {
        "TranscriptionJobName": "Alexa-Polly-Text",
        "TranscriptionJobStatus": "IN_PROGRESS",
        "LanguageCode": "en-US",
        "Media": {
            "MediaFileUri": "s3://qtm350finalproject/qtm350 recording — Polly.mp3"
        },
        "StartTime": 1636523681.223,
        "CreationTime": 1636523681.199
    }
}


In [45]:
!aws s3 cp s3://qtm350finalproject/Alexa-Polly-Text.json Alexa-Polly-Text.json

download: s3://qtm350finalproject/Alexa-Polly-Text.json to ./Alexa-Polly-Text.json


In [46]:
with open("Alexa-Polly-Text.json", "r") as f:
    array = json.load(f)
transcription_Polly = array["results"]["transcripts"][0]["transcript"]
with open("AlexaSample.txt", "r") as f1:
    text = f1.read()

In [47]:
transcription_Polly

"Alexa Play Wake me up when september ends Alexa, volume up Alexa. I'm boring. So tell me a joke Alexa, Is it going to rain today Alexa? What is today's date Alexa? What should I take to get to Emory University Alexa? I broke up with my significant other and I'm really sad. What should I do Alexa? I want to buy a birthday gift for my best friend. Do you have any suggestions?"