output is mumbled for a specific sentence and voice #128

Open
vcalv opened this issue Jul 4, 2023 · 4 comments

Comments

@vcalv

vcalv commented Jul 4, 2023

If you use the voice en_US-ryan to synthesize the sentence

"My general knowledge of Smith and his ideas has also benefited from numerous conversations about Smith with Don Boudreaux of George Mason and James Otteson while he was at Yeshiva University"

it mumbles almost every word after "knowledge".

The problem occurs only with this specific voice, not with any of the others I've tested, which suggests the phoneme conversion is OK.

piper.mp4
@shuja-u

shuja-u commented Jul 9, 2023

I can confirm I'm having the same issue with that voice.

@vcalv
Author

vcalv commented Jul 9, 2023

It's completely bananas, because if you try the exact same sentence with another US English voice, it's near perfect.

Example with en_US-joe-medium.

en_US-joe-medium.mp4

vcalv changed the title from "output is just mumbled for a specific sentene and voice" to "output is mumbled for a specific sentence and voice" Jul 9, 2023
@synesthesiam
Contributor

Very odd. I can confirm this is a problem with the medium and high quality Ryan voices. The low quality voice seems to work, though. My only guess is that (1) there are bad training samples in the Ryan dataset, and (2) this text is similar to one of those bad samples.
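
For anyone who wants to reproduce the comparison across voices and quality levels, here is a minimal sketch. It assumes the `piper` executable is on your PATH and that the model files are already downloaded into the current directory; the file names in `MODELS` are assumed local paths, and `build_command` and `run_all` are hypothetical helper names, not part of Piper.

```python
import subprocess

# The sentence from the original report.
SENTENCE = (
    "My general knowledge of Smith and his ideas has also benefited from "
    "numerous conversations about Smith with Don Boudreaux of George Mason "
    "and James Otteson while he was at Yeshiva University"
)

# Assumed local model file names; adjust to wherever your voices live.
MODELS = [
    "en_US-ryan-low.onnx",
    "en_US-ryan-medium.onnx",
    "en_US-ryan-high.onnx",
    "en_US-joe-medium.onnx",
]


def build_command(model_path):
    """Build the piper command line for one model, writing <model>.wav."""
    output_path = model_path.replace(".onnx", ".wav")
    return ["piper", "--model", model_path, "--output_file", output_path]


def run_all(models=MODELS, text=SENTENCE):
    """Synthesize the same text with every model; piper reads text on stdin."""
    for model_path in models:
        subprocess.run(
            build_command(model_path),
            input=text.encode("utf-8"),
            check=True,  # raise if piper exits with an error
        )
```

Calling `run_all()` should leave one WAV file per model, so the outputs can be compared by ear.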

@erew123

erew123 commented Jun 7, 2024

Thank god... I thought it was me going nuts! en_US-ryan-high.onnx

I have had very strange results with this voice. I thought a certain character length was causing the issue, but I'm not sure about that and could be wrong. At times it can sound like a man trying to do a "world's fastest rapper" impression... and strangely, sometimes it works OK. I'm just adding a bit more information here for others who may experience this.

I was about to go down a whole route of trying to figure out what was wrong with my system setup, but it looks like it's just that one voice.
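
One way to check the length theory is to feed the voice progressively longer prefixes of the same sentence and listen for the point where the output degrades. This is only a sketch of how the test inputs could be generated; `word_prefixes` is a hypothetical helper, and the step size is arbitrary.

```python
def word_prefixes(sentence, step=5):
    """Return prefixes of the sentence that grow by `step` words at a time.

    The full sentence is always the last entry.
    """
    words = sentence.split()
    # Cut points at step, 2*step, ... plus the full length as the final cut.
    cuts = list(range(step, len(words), step)) + [len(words)]
    return [" ".join(words[:n]) for n in cuts]


if __name__ == "__main__":
    text = (
        "My general knowledge of Smith and his ideas has also benefited from "
        "numerous conversations about Smith with Don Boudreaux of George Mason "
        "and James Otteson while he was at Yeshiva University"
    )
    for prefix in word_prefixes(text):
        print(len(prefix), prefix)
```

Each printed prefix can then be synthesized separately. If the mumbling starts at a consistent length regardless of content, that would support the length theory; if it depends on specific words, that points more toward the training data.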

erew123 added a commit to erew123/alltalk_tts that referenced this issue Jun 11, 2024
Both those models have issues with generation rhasspy/piper#128
4 participants