[Feature request] add stopnet delay argument to synthesis function (tacotron) #440

WeberJulian · 2021-04-21T09:36:48Z

Sometimes the synthesis of a sentence is cut short at the last word. I know (or at least think) that this indicates something is amiss in the model or the dataset: the model was not trained long enough, the audio parameters could be tuned further (trim_db?), or the dataset quality is simply poor. But taking the time to fix that issue by debugging and training many models is a luxury that some people can't afford (maybe even more so for a low-resource language).

I would gladly do a PR to propose the feature but I'm not sure how to go about the implementation.
Would adding a stopnet delay (delaying the stop signal by n steps) solve this issue?
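
To make the idea concrete, here is a minimal sketch of what a stopnet delay could look like. It is not taken from the TTS codebase: the `DelayedStop` class, the `decode_length` helper, the `threshold` value, and the assumption that the stopnet yields one stop probability per decoder step are all hypothetical and only serve as illustration.

```python
from typing import Optional, Sequence


class DelayedStop:
    """Hypothetical helper: defer the stop signal by `delay` decoder steps."""

    def __init__(self, threshold: float = 0.5, delay: int = 0) -> None:
        self.threshold = threshold
        self.delay = delay
        # None until the stopnet fires for the first time.
        self._remaining: Optional[int] = None

    def should_stop(self, stop_prob: float) -> bool:
        """Call once per decoder step with that step's stop probability."""
        if self._remaining is None:
            if stop_prob <= self.threshold:
                return False
            # First time the stopnet fires: start the countdown.
            self._remaining = self.delay
        if self._remaining == 0:
            return True
        self._remaining -= 1
        return False


def decode_length(stop_probs: Sequence[float],
                  threshold: float = 0.5,
                  delay: int = 0) -> int:
    """Number of decoder steps kept for a given sequence of stop probabilities."""
    stopper = DelayedStop(threshold, delay)
    for step, prob in enumerate(stop_probs):
        if stopper.should_stop(prob):
            return step + 1
    # The stop signal never took effect; keep everything that was decoded.
    return len(stop_probs)
```

With `delay=0` this behaves like a plain threshold on the stop probability; a larger `delay` keeps that many extra decoder steps after the stopnet first fires.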

erogol · 2021-04-21T11:44:17Z

Do you think it would solve the problem for all occurrences?

Then you might need to tune that delay per sample.

In general, there are two tricks I also use:

1. Stopnet delay. Maybe delay longer than needed and trim the silence afterwards.
2. Don't use the stopnet at all; look at the attention map instead and signal stop when the attention reaches the last token.
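
As a rough illustration of the two tricks, the sketch below works on plain Python lists. Everything in it is an assumption made for the example and not code from the TTS repository: the function names, the `patience` parameter (how many consecutive steps the attention peak must sit on the last token before stopping), and the per-frame energy representation used for trimming.

```python
from typing import Sequence


def attention_stop_step(alignments: Sequence[Sequence[float]],
                        patience: int = 3) -> int:
    """Return how many decoder steps to keep.

    alignments[t][i] is the attention weight of decoder step t on input
    token i. Decoding stops once the attention peak has been on the last
    input token for `patience` consecutive steps.
    """
    hits = 0
    for step, weights in enumerate(alignments):
        last_token = len(weights) - 1
        peak = max(range(len(weights)), key=lambda i: weights[i])
        hits = hits + 1 if peak == last_token else 0
        if hits >= patience:
            return step + 1
    # Attention never settled on the last token; keep all steps.
    return len(alignments)


def trim_trailing_silence(frame_energies: Sequence[float],
                          floor: float) -> int:
    """Return how many frames remain after dropping trailing quiet frames."""
    end = len(frame_energies)
    while end > 0 and frame_energies[end - 1] < floor:
        end -= 1
    return end
```

The first function corresponds to stopping on the attention map, the second to the "delay longer, then trim" variant: decode past the stop signal, then cut the frames at the end whose energy falls below a chosen floor.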

WeberJulian · 2021-04-21T12:08:01Z

> Then you might need to tune that delay per sample.

I was thinking that in the worst case it would add half a second of silence at the end of the sample (but I never actually tried it).

> Don't use stopnet but look at the attention map and signal stop when the attention reaches the last token.

That approach sounds interesting, is it implemented yet?

erogol · 2021-04-21T12:17:27Z

> That approach sounds interesting, is it implemented yet?

I implemented it once, a long time ago, but I don't know where it is now :)

WeberJulian · 2021-04-21T12:58:55Z

> I implemented it once, a long time ago, but I don't know where it is now :)

Haha ok, I'm gonna look for it and try both approaches

stale · 2021-05-21T13:07:30Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

WeberJulian added the "feature request" label (feature requests for making TTS better) on Apr 21, 2021

stale bot added the "wontfix" label (This will not be worked on but feel free to help) on May 21, 2021

stale bot closed this as completed on May 28, 2021
