Why Azure TTS?

Text-to-speech in Azure Speech Services enables your applications, tools, or devices to convert text into natural, human-like synthesized speech. Choose from standard and neural voices, or create a custom voice unique to your product or brand. More than 75 standard voices are available in more than 45 languages and locales, and 5 neural voices are available in 4 languages and locales. For a full list, see supported languages.
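
As a rough illustration of what "choosing a voice" looks like in practice, the sketch below builds an SSML (Speech Synthesis Markup Language) request body that selects a voice by name. It uses only the Python standard library and does not call the service. The helper name `build_ssml` and the voice name `en-US-JennyNeural` are assumptions for illustration; check the supported languages list for the voices actually available to you.

```python
from xml.sax.saxutils import escape, quoteattr

# W3C SSML namespace used on the root <speak> element.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice_name, lang="en-US"):
    """Return an SSML document that reads `text` with the given voice.

    `build_ssml` is a hypothetical helper for this sketch, not part of
    any SDK. Text and attribute values are XML-escaped so characters
    such as '&' or '<' do not break the markup.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice_name)}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    # The voice name is an assumed example, not taken from this page.
    print(build_ssml("Hello from Azure TTS.", "en-US-JennyNeural"))
```

The resulting string is the kind of payload a synthesis request would carry; switching between a standard, neural, or custom voice would then be a matter of changing the `name` attribute.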

Overall, Azure TTS is well suited to a wide range of text-to-speech scenarios:

It has leading language coverage.
It is highly intelligible across all locales, which is essential for translation.
In some locales, we have applied the latest neural technology, so naturalness is also high. We will keep expanding to more locales in the coming months.
It offers voice customization to create a best-in-class, unique voice for a brand.

Azure TTS uses the latest neural TTS innovations, such as sequence-to-sequence neural acoustic models and neural vocoders.

There are two major quality metrics for TTS.

Naturalness: how the voice compares to human speech. We use the mean opinion score (MOS) to measure TTS naturalness.
Intelligibility: whether the readout is understandable to a human listener. We use multiple judges to rate it.
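
To make the two metrics concrete, here is a minimal sketch of how such scores are commonly aggregated. It assumes the conventional 1 to 5 listener rating scale for MOS and a simple majority vote across judges for intelligibility; this page does not specify the scale or the aggregation rule, so both are assumptions, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence half-width) for listener ratings.

    Assumes the conventional 1-5 scale (1 = bad, 5 = excellent) and a
    normal approximation for the confidence interval.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    mos = mean(ratings)
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


def intelligibility_rate(judgements):
    """Return the fraction of utterances judged understandable.

    `judgements` holds one list of booleans per utterance, one boolean
    per judge. An utterance counts as intelligible when a strict
    majority of its judges say so (an assumed aggregation rule).
    """
    if not judgements:
        raise ValueError("need at least one utterance")
    passed = sum(1 for votes in judgements if sum(votes) * 2 > len(votes))
    return passed / len(judgements)


if __name__ == "__main__":
    mos, ci = mean_opinion_score([4, 5, 4, 3, 4])
    print(f"MOS = {mos:.2f} +/- {ci:.2f}")
    votes = [[True, True, False], [True, True, True],
             [False, False, True], [True, True, True]]
    print(f"Intelligibility = {intelligibility_rate(votes):.0%}")
```

Reporting the confidence interval alongside the MOS helps when comparing two voices, since small MOS differences from a handful of listeners are often within noise.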

Overall, the metrics on naturalness and intelligibility are highly competitive.

The quality also speaks for itself: multiple first-party and third-party partners have chosen to use Azure TTS. For more, see Where can I see the demo or app using Azure TTS service?

Azure TTS: Empower every person and every organization on the planet to have a delightful digital voice!
Azure Custom Voice: Build your one-of-a-kind Custom Voice and close to human Neural TTS in cloud and edge!

Azure Speech Document

Create Custom Neural Voice

Speech SDK

Azure Speech Containers
