[Feature request] add stopnet delay argument to synthesis function (tacotron) #440
Labels: feature request (feature requests for making TTS better), wontfix (this will not be worked on but feel free to help)
Sometimes synthesis of a sentence is cut short at the last word. I know (or at least think) this indicates that something is amiss in the model or the dataset: the model was not trained long enough, the audio parameters could be tuned further (trim_db?), or the dataset quality is poor. But taking the time to fix that by debugging and training many models is a luxury that some people can't afford (perhaps even more so for a low-resource language).
I would gladly do a PR to propose the feature but I'm not sure how to go about the implementation.
Would adding a stopnet delay (delaying the stop signal by n steps) solve this issue?
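To make the idea concrete, here is a minimal sketch of the behaviour I have in mind. It is not the actual Tacotron decoder code: `decode_length`, `stop_delay`, `stop_threshold` and `max_steps` are hypothetical names, and the stopnet output is simulated as a plain list of per-step probabilities. The only point is that once the stop probability first crosses the threshold, decoding continues for `stop_delay` extra steps before breaking.

```python
from typing import Iterable, Optional


def decode_length(
    stop_probs: Iterable[float],
    stop_threshold: float = 0.5,  # hypothetical: threshold on the stopnet output
    stop_delay: int = 0,          # hypothetical: extra steps to run after the first stop signal
    max_steps: int = 1000,        # hypothetical: hard cap on decoder steps
) -> int:
    """Return how many decoder steps (frames) are kept before stopping."""
    remaining: Optional[int] = None  # None until the stopnet fires for the first time
    steps = 0
    for prob in stop_probs:
        if steps >= max_steps:
            break
        steps += 1  # the frame produced at this step is kept
        if remaining is None and prob > stop_threshold:
            # First stop signal: start counting down the delay.
            remaining = stop_delay
        if remaining is not None:
            if remaining == 0:
                break
            remaining -= 1
    return steps


# Simulated stopnet outputs: the stop signal first fires at step 3.
probs = [0.1, 0.2, 0.9, 0.95, 0.99, 0.99]
print(decode_length(probs, stop_delay=0))  # stops at the first stop signal
print(decode_length(probs, stop_delay=2))  # keeps two extra frames
```

With `stop_delay=0` this matches the usual "stop as soon as the threshold is crossed" behaviour, so the argument could default to 0 without changing existing results. Whether the extra frames contain the missing end of the word or just silence would depend on the model.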