1. [Install Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#install-boto3)

Install the latest Boto3 release via pip:

In [None]:
!pip3 install boto3

2. [Configuration using Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#using-boto3)

You need: 
- aws_access_key_id = YOUR_ACCESS_KEY
- aws_secret_access_key = YOUR_SECRET_KEY
- region = YOUR_REGION

In [26]:
import boto3
# Create a session with an explicit region and credentials
sesion = boto3.Session(
    region_name="YOUR_REGION",
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)


3. [Amazon Polly Client.](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/polly.html) 

The Amazon Polly service provides API operations for synthesizing high-quality speech from plain text and Speech Synthesis Markup Language (SSML), along with managing pronunciation lexicons that enable you to get the best results for your application domain.

In [7]:
polly_client = sesion.client('polly')

4. [Start Speech Synthesis Task API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/polly/client/start_speech_synthesis_task.html)

Creates an asynchronous synthesis task by starting a new SpeechSynthesisTask. This operation requires all the standard information needed for speech synthesis, plus the name of an Amazon S3 bucket where the service stores the output, and accepts two optional parameters (OutputS3KeyPrefix and SnsTopicArn). Once the task is created, the operation returns a SpeechSynthesisTask object that includes the task identifier and its current status. The SpeechSynthesisTask object is available for 72 hours after the asynchronous synthesis task starts.

In [6]:
# Parameters shared by the start_speech_synthesis_task and synthesize_speech calls
TEXT = 'This is a sample text to be synthesized.'
OUT_PUT_S3_BUCKET_NAME = "your-bucket"
OUT_PUT_S3_KEY_PREFIX = "your-prefix"
OUT_PUT_FORMAT = "mp3"  # 'json' | 'mp3' | 'ogg_vorbis' | 'pcm'
ENGINE = 'neural'  # 'standard' | 'neural'

### Choose the VoiceId


[Neural Voices](https://docs.aws.amazon.com/polly/latest/dg/ntts-voices-main.html)

In [21]:
response = polly_client.start_speech_synthesis_task(
    VoiceId='Joanna',
    OutputS3BucketName=OUT_PUT_S3_BUCKET_NAME,
    OutputS3KeyPrefix=OUT_PUT_S3_KEY_PREFIX,
    OutputFormat=OUT_PUT_FORMAT,
    Text=TEXT,
    Engine=ENGINE,
)

taskId = response['SynthesisTask']['TaskId']

print( "Task id is {} ".format(taskId))

Task id is 54aa327a-5f2a-4d71-9d22-3dcb94416710 


5. [Get speech synthesis task.](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/polly/client/get_speech_synthesis_task.html) 

Retrieves a specific SpeechSynthesisTask object based on its TaskID. This object contains information about the given speech synthesis task, including the status of the task, and a link to the S3 bucket containing the output of the task.

In [10]:
import time


In [None]:
max_time = time.time() + 3 * 60 * 60  # give up after 3 hours
while time.time() < max_time:
    # Ask Polly for the current state of the task
    response_task = polly_client.get_speech_synthesis_task(TaskId=taskId)
    
    status = response_task['SynthesisTask']['TaskStatus']
    print("Polly SynthesisTask: {}".format(status))
    
    if status == "completed" or status == "failed":
        if status == "failed": 
            reason = response_task['SynthesisTask']['TaskStatusReason']
            print("TaskStatusReason: {}".format(reason))
        else:
            outPutUri= response_task['SynthesisTask']['OutputUri']
            print("OutputUri: {}".format(outPutUri))
        break
        
    time.sleep(2)
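
The polling pattern above can be factored into a small reusable helper. The following is only a sketch: `wait_for_task`, its parameters, and the fake status source are hypothetical names introduced for illustration and are not part of Boto3. It takes any callable that returns the current `TaskStatus`, so it can be tried offline without calling Polly.

```python
import time


def wait_for_task(get_status, timeout_s=180.0, interval_s=2.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it reports a terminal status.

    Returns "completed" or "failed", or None if the deadline passes first.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        status = get_status()
        if status in ("completed", "failed"):
            return status
        sleep(interval_s)
    return None


# With the real client (not executed here), the callable could be:
# lambda: polly_client.get_speech_synthesis_task(
#     TaskId=taskId)["SynthesisTask"]["TaskStatus"]

# Offline demonstration with a fake sequence of statuses.
fake_statuses = iter(["scheduled", "inProgress", "completed"])
result = wait_for_task(lambda: next(fake_statuses), interval_s=0.0)
print(result)
```

Returning `None` on timeout lets the caller tell "still running" apart from "failed" before trying to read `OutputUri`.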

### Download the file from the S3 bucket

In [None]:
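# The object key is everything after "/<bucket>/" in the OutputUri
# (assumes the path-style URL https://s3.<region>.amazonaws.com/<bucket>/<key>)
bucket_file = outPutUri.split("/" + OUT_PUT_S3_BUCKET_NAME + "/", 1)[-1]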
local_file = 'speech_local.mp3'

In [None]:
# Reuse the session so S3 gets the same credentials and region as Polly
s3 = sesion.resource('s3')
s3.Bucket(OUT_PUT_S3_BUCKET_NAME).download_file(bucket_file, local_file)
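
Deriving the object key from `OutputUri` is easy to get subtly wrong, because the last URI segment may or may not already contain the key prefix. A minimal sketch that takes everything after the bucket name instead is shown below. It assumes the URI is a path-style URL of the form `https://s3.<region>.amazonaws.com/<bucket>/<key>`. The helper name `s3_key_from_output_uri` and the example URIs are hypothetical.

```python
from urllib.parse import unquote, urlparse


def s3_key_from_output_uri(output_uri, bucket_name):
    """Return the object key from a path-style S3 URL (assumed format)."""
    path = unquote(urlparse(output_uri).path).lstrip("/")
    bucket_prefix = bucket_name + "/"
    if not path.startswith(bucket_prefix):
        raise ValueError("URI path does not start with the bucket name")
    return path[len(bucket_prefix):]


# Hypothetical URIs, for illustration only.
uri_flat = ("https://s3.us-east-1.amazonaws.com/your-bucket/"
            "your-prefix.54aa327a-5f2a-4d71-9d22-3dcb94416710.mp3")
uri_nested = "https://s3.us-east-1.amazonaws.com/your-bucket/audio/.abc.mp3"

key_flat = s3_key_from_output_uri(uri_flat, "your-bucket")
key_nested = s3_key_from_output_uri(uri_nested, "your-bucket")
print(key_flat)
print(key_nested)
```

The resulting key can be passed directly to `download_file` in place of a key assembled from the prefix.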

6. [Synthesize speech.](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/polly/client/synthesize_speech.html)

Synthesizes UTF-8 input, either plain text or SSML, to a stream of bytes. SSML input must be valid, well-formed SSML. Some alphabets might not be available with all voices (for example, Cyrillic might not be read at all by English voices) unless phoneme mapping is used.
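
Because SSML input must be well-formed, characters such as `&` and `<` in the source text need to be escaped before they are wrapped in a `<speak>` element. The sketch below shows one way to do that with the standard library. The helper `to_ssml` and its `pause_ms` parameter are hypothetical names for illustration. The markup is passed to Polly with `TextType="ssml"`.

```python
from xml.sax.saxutils import escape


def to_ssml(text, pause_ms=0):
    """Wrap plain text in <speak>, escaping XML special characters."""
    body = escape(text)
    if pause_ms > 0:
        # Optional trailing pause, expressed with the SSML <break> tag
        body += '<break time="{}ms"/>'.format(pause_ms)
    return "<speak>{}</speak>".format(body)


ssml = to_ssml("Fish & chips < 5 dollars", pause_ms=500)
print(ssml)

# Not executed here: send the markup instead of plain text.
# response = polly_client.synthesize_speech(
#     VoiceId="Joanna", OutputFormat="mp3", Engine="neural",
#     TextType="ssml", Text=ssml)
```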

In [25]:
response = polly_client.synthesize_speech(
                VoiceId='Joanna',
                OutputFormat=OUT_PUT_FORMAT, 
                Text = TEXT,
                Engine = ENGINE)

# Write the audio bytes to disk; the context manager closes the file
with open('speech.mp3', 'wb') as file:
    file.write(response['AudioStream'].read())
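
The cell above reads the whole `AudioStream` into memory before writing it. For longer audio, the stream can be copied in chunks instead. The following is a sketch under that assumption: `save_audio_stream` is a hypothetical helper that works with any binary stream exposing `read()` and `close()`, so it is demonstrated here with an in-memory stand-in rather than a real Polly response.

```python
import io
import os
import tempfile
from contextlib import closing


def save_audio_stream(stream, path, chunk_size=8192):
    """Copy a readable binary stream to path in chunks; return bytes written."""
    written = 0
    # closing() makes sure the source stream is closed when the copy ends
    with closing(stream), open(path, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written


# Offline demonstration with fake audio bytes in a temporary directory.
fake_audio = io.BytesIO(b"\x00" * 20000)
demo_path = os.path.join(tempfile.mkdtemp(), "speech.mp3")
n_bytes = save_audio_stream(fake_audio, demo_path)
print(n_bytes)

# With a real response (not executed here):
# save_audio_stream(response["AudioStream"], "speech.mp3")
```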
