cloudtts is a wrapper library for the following text-to-speech API services:
- Microsoft Azure Text to Speech API
- IBM Watson Text to Speech API
- Amazon Polly
- Google Cloud Text-to-Speech
Register an account with each service you want to use and obtain its credentials.
from cloudtts import XXXClient, XXXCredential
cred = XXXCredential(aaa='aaa', bbb='bbb')
c = XXXClient(cred)
Alternatively, create the client first and authenticate afterwards:
from cloudtts import XXXClient, XXXCredential
c = XXXClient()
cred = XXXCredential(aaa='aaa', bbb='bbb')
c.auth(cred)
XXXClient stands for one of AzureClient, GoogleClient, PollyClient, or WatsonClient.
audio = c.tts('Hello world!')
The tts() method takes three arguments:
- text (required) : String to be synthesized.
- voice_config (optional) : Configuration to synthesize text. VoiceConfig is described in the following section.
- detail (optional) : Parameters to synthesize text.
There are two parameters to configure the voice: voice_config and detail.
voice_config is an instance of the VoiceConfig class. It is instantiated with three parameters:
- audio_format : You can specify audio format. Available formats and services are listed below. Default value is mp3.
  - cloudtts.AudioFormat.mp3 : Azure, Google, Polly, Watson
  - cloudtts.AudioFormat.ogg_opus : Google, Watson
  - cloudtts.AudioFormat.ogg_vorbis : Polly, Watson
  - cloudtts.AudioFormat.pcm : Azure, Polly
- gender : You can specify gender of voice. Available genders are listed below. Default value is female.
  - cloudtts.Gender.male
  - cloudtts.Gender.female
- language : You can specify language of voice. Available languages are listed below. Default value is en_US.
  - cloudtts.Language.da_DK : Azure / Polly
  - cloudtts.Language.de_DE : Azure / Google / Polly / Watson
  - cloudtts.Language.en_AU : Azure / Google / Polly
  - cloudtts.Language.en_GB : Azure / Google / Polly / Watson
  - cloudtts.Language.en_IN : Azure / Polly
  - cloudtts.Language.en_US : Azure / Google / Polly / Watson
  - cloudtts.Language.es_ES : Azure / Google / Polly / Watson
  - cloudtts.Language.es_US : Polly / Watson
  - cloudtts.Language.fr_CA : Azure / Google / Polly
  - cloudtts.Language.fr_FR : Azure / Google / Polly / Watson
  - cloudtts.Language.hi_IN : Azure / Polly
  - cloudtts.Language.it_IT : Azure / Google / Polly / Watson
  - cloudtts.Language.ja_JP : Azure / Google / Polly / Watson
  - cloudtts.Language.ko_KR : Azure / Google / Polly
  - cloudtts.Language.nb_NO : Azure / Polly
  - cloudtts.Language.nl_NL : Azure / Google / Polly
  - cloudtts.Language.pl_PL : Azure / Polly
  - cloudtts.Language.pt_BR : Azure / Google / Polly / Watson
  - cloudtts.Language.pt_PT : Azure / Polly
  - cloudtts.Language.ro_RO : Azure / Polly
  - cloudtts.Language.ru_RU : Azure / Polly
  - cloudtts.Language.sv_SE : Azure / Google / Polly
  - cloudtts.Language.tr_TR : Azure / Google / Polly
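Since not every service supports every format and language, it can help to check a combination before choosing a client. The following is a minimal, standalone sketch that does not use cloudtts itself: the tables are transcribed from the lists above as plain strings (only a few languages are included), and services_for is a hypothetical helper, not part of the library.

```python
# Support tables transcribed from the lists above, using plain strings.
FORMAT_SUPPORT = {
    "mp3": {"Azure", "Google", "Polly", "Watson"},
    "ogg_opus": {"Google", "Watson"},
    "ogg_vorbis": {"Polly", "Watson"},
    "pcm": {"Azure", "Polly"},
}

# Only a few languages are listed to keep the sketch short.
LANGUAGE_SUPPORT = {
    "en_US": {"Azure", "Google", "Polly", "Watson"},
    "es_US": {"Polly", "Watson"},
    "ko_KR": {"Azure", "Google", "Polly"},
    "hi_IN": {"Azure", "Polly"},
}


def services_for(audio_format, language):
    """Return the sorted services that support both the format and the language."""
    return sorted(FORMAT_SUPPORT[audio_format] & LANGUAGE_SUPPORT[language])


print(services_for("ogg_vorbis", "es_US"))  # ['Polly', 'Watson']
print(services_for("pcm", "ko_KR"))         # ['Azure', 'Polly']
print(services_for("ogg_opus", "hi_IN"))    # []
```

An empty result means that no listed service offers that format for that language, so a different format or language has to be chosen.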
detail is a dictionary. You can specify service-specific key-value pairs, for example:
- Azure : "language": "zh-CN", "gender": "female", "voice": "Yaoyao, Apollo"
- Google : "gender": texttospeech.enums.SsmlVoiceGender.FEMALE
- Polly : "sample_rate": "16000"
- Watson : "voice": "en-US_LisaVoice"
Values in detail override the values generated from VoiceConfig.
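The precedence between the two parameters can be pictured as a dictionary merge. This is only an illustration of the semantics, not the library's actual code: merge_params is a hypothetical helper, and the generated dictionary is an assumed stand-in for whatever a client derives from VoiceConfig.

```python
def merge_params(generated, detail=None):
    """Return request parameters in which keys from detail win over generated ones."""
    params = dict(generated)       # copy so the caller's dictionary is not modified
    params.update(detail or {})    # detail takes precedence on conflicting keys
    return params


# Assumed stand-in for the values a client derives from VoiceConfig.
generated = {"language": "en-US", "gender": "female"}
# Service-specific overrides, using the Azure keys shown above.
detail = {"language": "zh-CN", "voice": "Yaoyao, Apollo"}

params = merge_params(generated, detail)
print(params)
```

Keys that appear only in the generated values (here gender) are kept, while conflicting keys (here language) take the value from detail.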
AzureCredential requires api_key.
AzureClient's tts() accepts both plain text and SSML as text.
The credential for GoogleClient is the path to the file that you downloaded from the Google Cloud Console.
To synthesize SSML with tts(), pass the markup as the ssml argument:
audio = c.tts(ssml='<speak>Hello <break time="300ms" /> world</speak>')
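When the SSML is assembled from arbitrary text, characters such as & and < need to be escaped so that the markup stays well-formed. A minimal sketch using only the standard library follows; speak_with_pause is a hypothetical helper and not part of cloudtts.

```python
from xml.sax.saxutils import escape


def speak_with_pause(first, second, pause_ms=300):
    """Build an SSML string that inserts a pause between two pieces of text."""
    return '<speak>{} <break time="{}ms" /> {}</speak>'.format(
        escape(first), pause_ms, escape(second)
    )


ssml = speak_with_pause("Hello", "world")
print(ssml)  # <speak>Hello <break time="300ms" /> world</speak>

# Special characters are escaped before being embedded in the markup.
print(speak_with_pause("R&D", "news", pause_ms=500))
```

The resulting string can then be passed as the ssml argument of tts().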
PollyCredential requires region_name. You can also set aws_access_key_id and aws_secret_access_key.
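To keep keys out of source code, the values can be read from the environment. The sketch below assumes the conventional AWS variable names (AWS_DEFAULT_REGION, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY); polly_credential_kwargs is a hypothetical helper, not part of cloudtts, and the PollyCredential call is shown only as a comment.

```python
import os


def polly_credential_kwargs(env=None):
    """Collect keyword arguments for PollyCredential from environment variables."""
    env = os.environ if env is None else env
    region = env.get("AWS_DEFAULT_REGION")
    if not region:
        # region_name is the only value described as required.
        raise ValueError("AWS_DEFAULT_REGION must be set")
    kwargs = {"region_name": region}
    optional = {
        "aws_access_key_id": "AWS_ACCESS_KEY_ID",
        "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    }
    for name, variable in optional.items():
        if env.get(variable):
            kwargs[name] = env[variable]
    return kwargs


# Usage with the library (not executed here):
# cred = PollyCredential(**polly_credential_kwargs())
example = polly_credential_kwargs({"AWS_DEFAULT_REGION": "us-east-1"})
print(example)  # {'region_name': 'us-east-1'}
```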
To synthesize SSML with tts(), pass the markup as the ssml argument:
audio = c.tts(ssml='<speak>Hello <break time="300ms" /> world</speak>')
WatsonCredential requires username, password, and url.
WatsonClient's tts() accepts both plain text and SSML as text.
See sample.py for a complete example.
To set up an environment for developing cloudtts, install the requirements:
$ pip install --requirement requirements.txt
Before submitting changes, please run the unit tests:
$ python -m unittest