[Feature request] prosody rate, style emotions, expressiveness, aggressiveness, pace, etc. #437

andrewarrow · 2021-04-18T13:06:05Z

The resemble.ai system has markup like:

<prosody rate="45%"><style emotions="expressiveness:0.9 aggressiveness:0.5 pace:0.2">
<say-as interpret-as="characters">Zeuxis</say-as></style></prosody>

Is this open sourced in coqui?


AndrewBarfield · 2021-04-18T16:11:20Z

I've been thinking about the same. Especially speech rate.

I've also come across some text that isn't read correctly, such as number ranges (e.g., 400-750) and acronyms (e.g., MPH). These could be interpreted correctly via markup configuration.

erogol · 2021-04-19T22:32:40Z

This level of detail is not possible with coqui TTS yet due to the limits of the open datasets.

Depending on which model you use, it might struggle with the acronyms and numbers too.

These are limitations due to the use of a publicly available dataset. Most commercial systems use specially created TTS datasets.

AndrewBarfield · 2021-04-19T22:37:27Z

For numerics and acronyms, we can simply preprocess the string before synthesizing using search and replace or regex.

This is no show stopper.
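A minimal sketch of that preprocessing idea, assuming plain Python with only the standard `re` module. The acronym table and the function names (`expand_ranges`, `expand_acronyms`, `normalize`) are hypothetical and not part of coqui TTS; the result would simply be passed to whatever synthesis call you already use.

```python
import re

# Hypothetical acronym table; extend it for your own text.
ACRONYMS = {"MPH": "miles per hour"}

# Two runs of digits joined by a hyphen, e.g. "400-750".
RANGE_RE = re.compile(r"\b(\d+)\s*-\s*(\d+)\b")


def expand_ranges(text: str) -> str:
    """Rewrite numeric ranges such as '400-750' as '400 to 750'."""
    return RANGE_RE.sub(r"\1 to \2", text)


def expand_acronyms(text: str, table: dict = ACRONYMS) -> str:
    """Replace whole-word acronyms with their spoken form."""
    if not table:
        return text
    # Longest keys first so a longer acronym wins over a shorter prefix.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: table[m.group(1)], text)


def normalize(text: str) -> str:
    """Apply all replacements before handing the text to the synthesizer."""
    return expand_acronyms(expand_ranges(text))


print(normalize("Speeds of 400-750 MPH"))
```

Note that the range pattern is deliberately naive: it would also rewrite hyphenated dates or phone numbers, so real input likely needs stricter rules.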

erogol · 2021-04-20T08:58:01Z

That's true. Some of the models we release use phonemes and a text front end to do that work. You might like to try them.

The only model that uses only characters is tts_models/en/ljspeech/tacotron2-DDC; the rest are more robust to such variations.

Hopefully we'll update this model soon to use a more advanced front end.

andrewarrow added the "feature request" label (feature requests for making TTS better) on Apr 18, 2021

erogol closed this as completed Apr 19, 2021
