SSML Support? #275

duplaja · 2023-11-19T01:59:45Z

I apologize if I missed this somewhere in the documentation, but does the Python install support SSML? I'm converting some of my old scripts over from Mimic3, and didn't see any flag like Mimic3's --ssml.

Alternatively, what's the best way to add a break between paragraphs, if SSML is not supported?

Thank you!

nptrainor · 2023-12-31T20:20:50Z

This would be incredibly useful.

Any thoughts on when/whether this is in train?

synesthesiam · 2024-01-14T05:42:29Z

This is planned, but I haven't made any progress yet. Adding pauses and changing the playback speed is easy, but switching voices will require more changes.

DaveXanatos · 2024-05-08T02:20:16Z

@synesthesiam Another vote for SSML here. I'm especially interested in emphasizing some words/phrases. FWIW I wrote a script that allows embedding a voice name in the speech text string to switch the voice. I am happy to share that if there's interest. My plan is to have a "Speech Center" up and running on ZeroMQ (or any messaging system like MQTT, Rabbit, etc.), so that different scripts can send text to be spoken, with embedded voice commands, to the speech center via messaging. So far, these voices are AWESOME and the script I'm using makes them incredibly simple to switch between. Thank you for making all these available.

fantnhu · 2024-05-20T13:09:37Z

That would be great :) I've been looking forward to SSML and other supplements for a long time. I don't want to use any other TTS because I am satisfied.
SSML, pauses, and custom tags would improve the experience a lot: [laughter], [laughs], [sighs]... like in Bark TTS. If you could solve pauses first, with similar parameters, that would be great, e.g. [wait=2s]
Thanks for your work!
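
To make the [wait=2s] idea concrete, here is a minimal sketch of how such a tag could be split out of the input text before synthesis. The tag syntax is only the one suggested above, not anything Piper supports, and `split_on_waits` is a hypothetical helper name.

```python
import re

# Hypothetical tag syntax taken from the suggestion above: [wait=2s], [wait=0.5s].
# This is not a Piper feature.
WAIT_TAG = re.compile(r"\[wait=(\d+(?:\.\d+)?)s\]")


def split_on_waits(text):
    """Split text into ("text", str) and ("pause", seconds) items, in order."""
    items = []
    pos = 0
    for match in WAIT_TAG.finditer(text):
        chunk = text[pos:match.start()].strip()
        if chunk:
            items.append(("text", chunk))
        items.append(("pause", float(match.group(1))))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        items.append(("text", tail))
    return items


if __name__ == "__main__":
    for item in split_on_waits("First paragraph. [wait=2s] Second paragraph."):
        print(item)
```

Each "text" item would be synthesized on its own, and each "pause" item would become that many seconds of silence in the output audio.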

synesthesiam · 2024-05-23T01:53:44Z

I'm finally making some progress on SSML. The next version of Piper should support breaks (pauses), word/phoneme substitutions, and some say-as forms (number, date, etc.).

I can't do laughter and sighs, unfortunately. Those would have had to be present in the original datasets.

DaveXanatos · 2024-05-23T02:12:29Z

This is excellent news! I've been creating some form of emphasis by adding a slight bit of time to the --length_scale and --sentence_silence parameters, but pauses and say-as are very welcome additions!!! Thanks!
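
For anyone wanting to try the same workaround from a script, here is a rough sketch that only builds the command line. It assumes the `piper` executable is on the PATH and takes `--model` and `--output_file` alongside the two parameters mentioned above. `piper_command` and the file names are hypothetical, and the numeric values are illustrative rather than recommended settings.

```python
def piper_command(model, output_file, length_scale=None, sentence_silence=None):
    """Build an argument list for the piper CLI without running it."""
    cmd = ["piper", "--model", model, "--output_file", output_file]
    if length_scale is not None:
        # Larger values stretch the speech, i.e. slow it down.
        cmd += ["--length_scale", str(length_scale)]
    if sentence_silence is not None:
        # Seconds of silence added after each sentence.
        cmd += ["--sentence_silence", str(sentence_silence)]
    return cmd


if __name__ == "__main__":
    # "voice.onnx" and "out.wav" are placeholder file names.
    print(piper_command("voice.onnx", "out.wav", length_scale=1.1, sentence_silence=0.4))
```

The resulting list can be passed to `subprocess.run(cmd, input=text.encode(), check=True)`, since piper reads the text to speak from standard input.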

andrewfr · 2024-06-16T00:15:50Z

@synesthesiam I'm using Piper and think it is great! I, too, am interested in SSML. I admit I know next to nothing about speech synthesis, and I am having trouble finding references on implementing SSML. Would it help to start by training a voice? Or would it be more feasible to apply an SSML feature to the audio stream? Closely connected is the question of how to implement "speech marks": when a word starts and finishes in the audio.

Thanks,
Andrew

nitinthewiz · 2024-06-18T18:19:23Z

Hey @synesthesiam! Looking forward to SSML support, especially to solve #401.
I was trying to get lessac high to say "COVID-19" properly, but to no avail. I reckon only SSML or IPA will make it say such things correctly.

andrewfr · 2024-06-18T18:39:48Z

Hi @nitinthewiz @synesthesiam I am a newbie to all this. I have looked at #401. This may turn out to be a long shot, but I am starting to learn Praat, a tool linguists use to study speech. If I can use Praat to modify Piper's output to match what an SSML tag would do, that could be a start.

nptrainor · 2024-07-02T08:10:06Z

@synesthesiam - That progress sounds fabulous - thank you so much.
It would be wonderful to change voices too. I am enjoying writing my own novels (stories to you and me) and using Piper to read them back to me. It is such a great help for editing etc, and having different voices for different characters would be superb.

andrewfr · 2024-08-21T03:47:09Z

@synesthesiam @nitinthewiz I am not sure if this helps. I started to play with pyDub because I needed to insert some silence to act as a break. I also used pyDub to alter an audio stream's pitch and volume. It is really dawning on me that a lot of SSML, provided one works at the sentence level, can be done in a post-synthesis stage.
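
As a rough illustration of the silence part, here is a sketch that joins separately synthesized sentences with a pause between them. It uses Python's standard-library `wave` module rather than pyDub, so it is not the code described above. It assumes every chunk is 16-bit signed PCM WAV data with the same sample rate and channel count, and `join_with_silence` is a hypothetical helper name.

```python
import io
import wave


def join_with_silence(wav_chunks, pause_seconds):
    """Concatenate WAV byte strings, inserting silence between them.

    Assumes all chunks are 16-bit signed PCM with identical sample rate
    and channel count, so zero bytes represent silence.
    """
    out = io.BytesIO()
    writer = None
    for index, chunk in enumerate(wav_chunks):
        with wave.open(io.BytesIO(chunk), "rb") as reader:
            if writer is None:
                # Copy the audio format from the first chunk.
                writer = wave.open(out, "wb")
                writer.setnchannels(reader.getnchannels())
                writer.setsampwidth(reader.getsampwidth())
                writer.setframerate(reader.getframerate())
            if index > 0:
                # Write pause_seconds worth of zero-valued frames.
                n_frames = int(pause_seconds * reader.getframerate())
                frame_size = reader.getsampwidth() * reader.getnchannels()
                writer.writeframes(b"\x00" * (n_frames * frame_size))
            writer.writeframes(reader.readframes(reader.getnframes()))
    if writer is not None:
        writer.close()  # finalizes the WAV header; the BytesIO stays open
    return out.getvalue()
```

Each element of `wav_chunks` would be the WAV output for one sentence or paragraph, and the returned bytes can be written straight to a `.wav` file.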

DaveXanatos · 2024-08-21T12:22:13Z

Much can be done post-synthesis, yes, unless the synthesis needs to produce output in real time in a responsive or conversational system (such as a robot running some type of LLM to engage with people in its environment). This is why I'm really looking forward to SSML support in Piper.

