TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface.

mediatechlab/tts-wrapper

TTS-Wrapper

Contributions are welcome! Check our contribution guide.

Currently the following services are supported:

  • AWS Polly
  • Google TTS
  • Microsoft TTS
  • IBM Watson
  • PicoTTS
  • SAPI (Microsoft Speech API)

Installation

Install using pip.

pip install TTS-Wrapper

Note: each service needs its own dependencies, which are installed as extras.

Example: to use Google and Watson:

pip install "TTS-Wrapper[google,watson]"

PicoTTS must be installed separately through your system's package manager. On Debian-based systems (Ubuntu and others), install the package libttspico-utils; on Arch-based systems (Manjaro and others), install the AUR package pico-tts.
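
Before selecting PicoTTS, it can help to confirm that the system package is actually installed. The following is a minimal sketch, assuming the package provides the `pico2wave` executable (as libttspico-utils does on Debian). The helper name `pico_available` is hypothetical and not part of TTS-Wrapper:

```python
import shutil


def pico_available(executable: str = "pico2wave") -> bool:
    """Return True if the PicoTTS command-line tool is found on PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which(executable) is not None


if __name__ == "__main__":
    print("PicoTTS installed:", pico_available())
```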

Usage

Simply instantiate an object from the desired service and call synth().

from tts_wrapper import PollyTTS, PollyClient

tts = PollyTTS(client=PollyClient())
tts.synth('<speak>Hello, world!</speak>', 'hello.wav')

Notice that you must create a client object to work with your service. Each service uses different authorization techniques. Check out the documentation to learn more.
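
Because every service exposes the same synth() call, code that produces audio does not need to know which engine is behind it. The sketch below illustrates this with a hypothetical helper, `synth_all`, that writes one file per phrase. It assumes only the `synth(text, filename)` call shown above; the file naming scheme is an arbitrary choice for illustration:

```python
from pathlib import Path


def synth_all(tts, phrases, out_dir):
    """Synthesize each phrase to its own numbered WAV file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, phrase in enumerate(phrases, start=1):
        target = out / f"phrase_{index:02d}.wav"
        # Wrap the phrase in <speak> tags, matching the example above.
        tts.synth(f"<speak>{phrase}</speak>", str(target))
        paths.append(target)
    return paths


# Possible usage (requires the polly extra and valid AWS credentials):
# from tts_wrapper import PollyTTS, PollyClient
# synth_all(PollyTTS(client=PollyClient()), ["Hello", "Goodbye"], "audio")
```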

Selecting a Voice

You can change the default voice and language like this:

tts = PollyTTS(client=PollyClient(), voice='Camila', lang='pt-BR')

Check out the list of available voices for Polly, Google, Microsoft, and Watson.

SSML

You can also use SSML markup to control the output of compatible engines.

tts.synth('<speak>Hello, <break time="3s"/> world!</speak>', 'hello.wav')

Prefer the ssml attribute, which generates the correct boilerplate tags for each engine:

tts.synth(tts.ssml.add('Hello, <break time="3s"/> world!'), 'hello.wav')

Learn which tags are available for each service: Polly, Google, Microsoft, and Watson.
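
SSML is XML, so characters such as & and < in plain text need to be escaped before they are embedded in markup. The sketch below uses the standard library's `xml.sax.saxutils.escape`. The helper name `ssml_safe` is hypothetical, and the usage comment assumes the wrapper does not already escape text itself (if it does, escaping twice would garble the output):

```python
from xml.sax.saxutils import escape


def ssml_safe(text: str) -> str:
    """Escape XML special characters so plain text can be embedded in SSML."""
    # escape() replaces &, < and > with their XML entities.
    return escape(text)


if __name__ == "__main__":
    print(ssml_safe("Tom & Jerry"))  # prints: Tom &amp; Jerry

# Possible usage, with tts created as in the Usage section:
# tts.synth(tts.ssml.add(ssml_safe("Tom & Jerry")), 'hello.wav')
```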

Authorization

To set up credentials to access each engine, create the respective client.

Polly

If you don't explicitly define credentials, boto3 will try to find them in your system's credentials file or your environment variables. However, you can specify them with a tuple:

from tts_wrapper import PollyClient
client = PollyClient(credentials=(region, aws_key_id, aws_access_key))
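
The explicit tuple is useful when credentials are stored somewhere boto3 would not look on its own. As a rough sketch, the helper below builds the tuple from application-specific environment variables. The function name, the `MYAPP_` prefix, and the variable names are all hypothetical; only the tuple order (region, key ID, access key) comes from the example above:

```python
import os


def polly_credentials_from_env(env=None, prefix="MYAPP_"):
    """Build the (region, key id, access key) tuple from prefixed variables."""
    env = os.environ if env is None else env
    # Hypothetical variable names; adjust them to match your deployment.
    names = [prefix + suffix for suffix in ("AWS_REGION", "AWS_KEY_ID", "AWS_SECRET_KEY")]
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)


# Possible usage:
# from tts_wrapper import PollyClient
# client = PollyClient(credentials=polly_credentials_from_env())
```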

Google

Point to your OAuth 2.0 credentials file path:

from tts_wrapper import GoogleClient
client = GoogleClient(credentials='path/to/creds.json')

Microsoft

Just provide your subscription key, like so:

from tts_wrapper import MicrosoftClient
client = MicrosoftClient(credentials='TOKEN')

If your region is not the default "eastus", you can change it like so:

client = MicrosoftClient(credentials='TOKEN', region='brazilsouth')

Watson

Pass your API key and URL to the initializer:

from tts_wrapper import WatsonClient
client = WatsonClient(credentials=('API_KEY', 'API_URL'))

PicoTTS & SAPI

These clients don't require authorization since they run offline.

from tts_wrapper import PicoClient, SAPIClient
client = PicoClient()
# or
client = SAPIClient()

File Format

By default, all audio is saved as a WAV file, but you can switch to MP3 with the format option:

tts.synth('<speak>Hello, world!</speak>', 'hello.mp3', format='mp3')
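
To keep the file extension and the format option from drifting apart, the format can be derived from the filename. This is a minimal sketch: the helper `format_for` is hypothetical, and it assumes that 'wav' and 'mp3' (the two formats mentioned above) are the only accepted values:

```python
from pathlib import Path

# Assumption: only the two formats mentioned in this section are accepted.
SUPPORTED_FORMATS = {"wav", "mp3"}


def format_for(filename: str) -> str:
    """Derive the format option from the output file's extension."""
    suffix = Path(filename).suffix.lower().lstrip(".")
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {suffix!r}")
    return suffix


# Possible usage, with tts created as in the Usage section:
# target = 'hello.mp3'
# tts.synth('<speak>Hello, world!</speak>', target, format=format_for(target))
```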

License

Licensed under the MIT License.