
Why Azure TTS?

szhaomsft edited this page Dec 28, 2019 · 2 revisions

Text-to-speech from Azure Speech Services is a service that enables your applications, tools, or devices to convert text into natural human-like synthesized speech. Choose from standard and neural voices, or create your own custom voice unique to your product or brand. 75+ standard voices are available in more than 45 languages and locales, and 5 neural voices are available in 4 languages and locales. For a full list, see supported languages.

Overall, Azure TTS is well suited to a wide range of text-to-speech scenarios:

  • It has leading language coverage.
  • It is highly intelligible across all locales, which is essential for translation scenarios.
  • In some locales, we have applied the latest neural technology updates, so the naturalness is also high. We plan to expand to more locales in the coming months.
  • It offers voice customization for creating a best-in-class, unique voice for a brand.

Azure TTS uses the latest neural TTS innovations, such as sequence-to-sequence neural acoustic models and neural vocoders.

There are two major quality metrics for TTS:

  • Naturalness: how the voice compares to human speech. We use the mean opinion score (MOS) to measure TTS naturalness.
  • Intelligibility: whether the readout is understandable to a human listener. We use multiple judges to rate it.
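
As a rough illustration of how these two metrics can be aggregated from listener data, here is a minimal Python sketch. It is not the Azure evaluation pipeline: the 1-to-5 rating scale, the normal-approximation confidence interval, the strict-majority rule for judges, and the function names `mean_opinion_score` and `intelligibility_rate` are all assumptions made for this example.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of listener ratings.

    Assumes each rating is on a 1-to-5 scale (an assumption of this sketch).
    The interval uses a normal approximation, which is only rough for small samples.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    mos = float(mean(ratings))
    if len(ratings) < 2:
        return mos, 0.0  # no spread can be estimated from a single rating
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


def intelligibility_rate(judgements):
    """Return the fraction of utterances judged intelligible.

    `judgements` holds one inner list per utterance, with one boolean per judge.
    An utterance counts as intelligible when a strict majority of its judges
    marked it understandable (a hypothetical rule chosen for this sketch).
    """
    if not judgements:
        raise ValueError("at least one utterance is required")
    passed = 0
    for votes in judgements:
        if not votes:
            raise ValueError("each utterance needs at least one judge")
        if sum(votes) * 2 > len(votes):
            passed += 1
    return passed / len(judgements)


if __name__ == "__main__":
    mos, ci = mean_opinion_score([4, 5, 4, 3, 4])
    print(f"MOS = {mos:.2f} +/- {ci:.2f}")

    votes = [
        [True, True, False],
        [False, False, True],
        [True, True, True],
        [True, False, True],
    ]
    print(f"Intelligibility = {intelligibility_rate(votes):.0%}")
```

Reporting an interval alongside the MOS makes it easier to tell whether a difference between two voices is larger than the rating noise.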

Overall, the metrics on naturalness and intelligibility are highly competitive.

The quality also speaks for itself: multiple first-party and third-party partners have chosen to use Azure TTS. For more, see "Where can I see the demo or app using Azure TTS service?"