## Amazon Polly

In the following examples, I will show how to interact with the Polly service using boto3.

We start by creating a Polly client:

In [21]:
import boto3
import IPython.display as ipd
from pprint import pprint
from contextlib import closing

session = boto3.session.Session()
polly_client = session.client('polly')

### Describe voices

In our first example, we will look at the `describe_voices` method, which returns all available voices that we can use to synthesize speech.

In [11]:
response = polly_client.describe_voices()

print(len(response['Voices']))

pprint(response['Voices'])

52
[{'Gender': 'Female',
  'Id': 'Filiz',
  'LanguageCode': 'tr-TR',
  'LanguageName': 'Turkish',
  'Name': 'Filiz'},
 {'Gender': 'Female',
  'Id': 'Astrid',
  'LanguageCode': 'sv-SE',
  'LanguageName': 'Swedish',
  'Name': 'Astrid'},
 {'Gender': 'Female',
  'Id': 'Tatyana',
  'LanguageCode': 'ru-RU',
  'LanguageName': 'Russian',
  'Name': 'Tatyana'},
 {'Gender': 'Male',
  'Id': 'Maxim',
  'LanguageCode': 'ru-RU',
  'LanguageName': 'Russian',
  'Name': 'Maxim'},
 {'Gender': 'Female',
  'Id': 'Carmen',
  'LanguageCode': 'ro-RO',
  'LanguageName': 'Romanian',
  'Name': 'Carmen'},
 {'Gender': 'Female',
  'Id': 'Ines',
  'LanguageCode': 'pt-PT',
  'LanguageName': 'Portuguese',
  'Name': 'Inês'},
 {'Gender': 'Male',
  'Id': 'Cristiano',
  'LanguageCode': 'pt-PT',
  'LanguageName': 'Portuguese',
  'Name': 'Cristiano'},
 {'Gender': 'Female',
  'Id': 'Vitoria',
  'LanguageCode': 'pt-BR',
  'LanguageName': 'Brazilian Portuguese',
  'Name': 'Vitória'},
 {'Gender': 'Male',
  'Id': 'Ricardo',
  'L

As we can see, there are many voices to choose from (52 at the time of writing). To make our life a bit easier, the `describe_voices` method accepts a `LanguageCode` parameter that filters the response to voices in the given language. A `LanguageCode` consists of an ISO 639 language code and an ISO 3166 country code, separated by a dash (`-`), for example `pl-PL` or `en-US`.
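Before filtering, it can help to see which language codes are present in the full listing. The following is a minimal sketch, assuming `voices` is the list stored under `response['Voices']`; the helper name `voices_per_language` is made up for this illustration and is not part of boto3.

```python
from collections import Counter


def voices_per_language(voices):
    """Count how many voices are listed for each LanguageCode."""
    return Counter(voice['LanguageCode'] for voice in voices)


# A few entries copied from the output above, trimmed to the fields we need.
sample_voices = [
    {'Id': 'Filiz', 'LanguageCode': 'tr-TR'},
    {'Id': 'Tatyana', 'LanguageCode': 'ru-RU'},
    {'Id': 'Maxim', 'LanguageCode': 'ru-RU'},
    {'Id': 'Ines', 'LanguageCode': 'pt-PT'},
]

counts = voices_per_language(sample_voices)
for language_code, count in sorted(counts.items()):
    print(language_code, count)
```

With a live client, the same helper could be called as `voices_per_language(response['Voices'])`.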

Let's try to get all Polish voices:

In [14]:
response = polly_client.describe_voices(LanguageCode='pl-PL')

print(len(response['Voices']))

pprint(response['Voices'])

4
[{'Gender': 'Female',
  'Id': 'Maja',
  'LanguageCode': 'pl-PL',
  'LanguageName': 'Polish',
  'Name': 'Maja'},
 {'Gender': 'Male',
  'Id': 'Jan',
  'LanguageCode': 'pl-PL',
  'LanguageName': 'Polish',
  'Name': 'Jan'},
 {'Gender': 'Male',
  'Id': 'Jacek',
  'LanguageCode': 'pl-PL',
  'LanguageName': 'Polish',
  'Name': 'Jacek'},
 {'Gender': 'Female',
  'Id': 'Ewa',
  'LanguageCode': 'pl-PL',
  'LanguageName': 'Polish',
  'Name': 'Ewa'}]
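The `describe_voices` response may also carry a `NextToken` when the listing is split across pages, so a single call is not guaranteed to return everything. Below is a hedged sketch of a loop that follows `NextToken` and then picks voice IDs by gender; the helper names `list_all_voices` and `voice_ids` are hypothetical, and `client` is assumed to be a Polly client like `polly_client` above.

```python
def list_all_voices(client, **filters):
    """Collect voices from every page returned by describe_voices.

    `filters` are passed straight through, e.g. LanguageCode='pl-PL'.
    """
    voices = []
    kwargs = dict(filters)
    while True:
        response = client.describe_voices(**kwargs)
        voices.extend(response.get('Voices', []))
        token = response.get('NextToken')
        if not token:
            return voices
        # Ask for the next page on the following iteration.
        kwargs['NextToken'] = token


def voice_ids(voices, gender=None):
    """Return voice Ids, optionally restricted to one Gender value."""
    return [v['Id'] for v in voices if gender is None or v['Gender'] == gender]
```

For example, `voice_ids(list_all_voices(polly_client, LanguageCode='pl-PL'), gender='Female')` would return only the female Polish voice IDs.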


### Synthesize speech - creating audio file

In our next example, we will try out the `synthesize_speech` method to create an `mp3` audio file from the provided text. Polly can output audio in `mp3`, `ogg_vorbis` and `pcm`. We will use the `en-US` voice `Matthew`, passed as `VoiceId`.
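Since the output format decides what kind of file we end up with, it can be handy to derive the file name from it. This is a small sketch, assuming only the three formats named above; the helper `audio_filename` and the extension chosen for each format are conventions made up for this illustration, not something Polly prescribes.

```python
# Output formats named in the text, mapped to a file extension.
# The extensions are an assumed naming convention for this sketch.
EXTENSIONS = {
    'mp3': 'mp3',
    'ogg_vorbis': 'ogg',
    'pcm': 'pcm',
}


def audio_filename(language_code, output_format, stem='polly-example'):
    """Build a file name such as './en-US-polly-example.mp3'."""
    if output_format not in EXTENSIONS:
        raise ValueError(f'unsupported output format: {output_format!r}')
    return f'./{language_code}-{stem}.{EXTENSIONS[output_format]}'


print(audio_filename('en-US', 'mp3'))
```

Rejecting unknown formats up front gives a clearer error than a failed request would.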

In [17]:
text = """
Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk,
and build entirely new categories of speech-enabled products.
Amazon Polly is a Text-to-Speech service that uses advanced deep learning technologies
to synthesize speech that sounds like a human voice.
With dozens of lifelike voices across a variety of languages,
you can select the ideal voice and build speech-enabled applications that work in many different countries.
"""

response = polly_client.synthesize_speech(
    OutputFormat='mp3',
    Text=text,
    VoiceId='Matthew'
)
pprint(response)

{'AudioStream': <botocore.response.StreamingBody object at 0x7f0daab1d0b8>,
 'ContentType': 'audio/mpeg',
 'RequestCharacters': '482',
 'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
                                      'content-type': 'audio/mpeg',
                                      'date': 'Sat, 24 Mar 2018 17:38:52 GMT',
                                      'transfer-encoding': 'chunked',
                                      'x-amzn-requestcharacters': '482',
                                      'x-amzn-requestid': '3839bb61-2f8a-11e8-bc44-abbcf813b055'},
                      'HTTPStatusCode': 200,
                      'RequestId': '3839bb61-2f8a-11e8-bc44-abbcf813b055',
                      'RetryAttempts': 0}}


In the response, we received `AudioStream`, which we will save to a file.

In [20]:
with closing(response['AudioStream']) as s:
    audio = s.read()
    with open('./en-US-polly-example.mp3', 'wb') as f:
        f.write(audio)
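The cell above reads the whole stream into memory before writing it out, which is fine for a short clip. For longer audio, a chunked copy is one alternative. The sketch below assumes only that the stream offers `read(size)` and `close()`, as the `AudioStream` object does; the helper name `save_audio_stream` and the default chunk size are choices made for this example.

```python
from contextlib import closing


def save_audio_stream(stream, path, chunk_size=64 * 1024):
    """Copy a readable byte stream to `path` in chunks.

    Returns the number of bytes written. The stream is closed afterwards.
    """
    written = 0
    with closing(stream) as source, open(path, 'wb') as target:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                # An empty read means the stream is exhausted.
                break
            target.write(chunk)
            written += len(chunk)
    return written
```

It could be used as `save_audio_stream(response['AudioStream'], './en-US-polly-example.mp3')`. Note that the stream can be consumed only once, so this replaces the cell above rather than following it.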

Now let's try to open the file and play it to test how well Polly did on our example text.

In [None]:
ipd.Audio('./en-US-polly-example.mp3')