output is mumbled for a specific sentence and voice #128

vcalv · 2023-07-04T17:59:18Z

If you use the voice en_US-ryan to synthesize the sentence

"My general knowledge of Smith and his ideas has also benefited from numerous conversations about Smith with Don Boudreaux of George Mason and James Otteson while he was at Yeshiva University"

it just mumbles almost every word after "knowledge".

This happens only with this specific voice and with none of the others I've tested, which suggests the phoneme conversion is OK.

piper.mp4


shuja-u · 2023-07-09T18:28:35Z

I can confirm having the same issue with that voice

vcalv · 2023-07-09T22:50:03Z

It's completely bananas, because if you try the exact same sentence with another US English voice, it's near perfect.

Example with en_US-joe-medium.

en_US-joe-medium.mp4

synesthesiam · 2023-07-10T20:45:26Z

Very odd, I can confirm this is a problem with the medium/high quality Ryan voices. The low quality voice seems to work, though. My only guess is that (1) there are bad training samples in the Ryan dataset, and (2) this text is similar to one of those bad samples.
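For anyone who wants to reproduce the comparison across voices, here is a minimal sketch. It assumes the `piper` binary is on your PATH and that the `.onnx` model files (with their `.json` configs) sit in the current directory under the names used in this thread; `build_command` and `synthesize_all` are hypothetical helper names, not part of Piper.

```python
import subprocess
from pathlib import Path

SENTENCE = (
    "My general knowledge of Smith and his ideas has also benefited from "
    "numerous conversations about Smith with Don Boudreaux of George Mason "
    "and James Otteson while he was at Yeshiva University"
)

# Voices mentioned in this thread. The file names are an assumption based on
# the "en_US-ryan-high.onnx" naming seen above.
VOICES = ["en_US-ryan-low", "en_US-ryan-medium", "en_US-ryan-high", "en_US-joe-medium"]


def build_command(voice, out_dir="out", piper_bin="piper"):
    """Return the piper command line that writes <out_dir>/<voice>.wav."""
    wav_path = Path(out_dir) / f"{voice}.wav"
    return [piper_bin, "--model", f"{voice}.onnx", "--output_file", str(wav_path)]


def synthesize_all(text=SENTENCE, voices=VOICES, out_dir="out"):
    """Feed the same text to every voice so the outputs can be compared by ear."""
    Path(out_dir).mkdir(exist_ok=True)
    for voice in voices:
        # piper reads the text to speak from standard input.
        subprocess.run(build_command(voice, out_dir), input=text.encode("utf-8"), check=True)

# To run the comparison (requires piper and the model files): synthesize_all()
```

If the low quality Ryan voice and the Joe voice come out clean while the medium and high Ryan voices mumble, that matches what is reported above.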

erew123 · 2024-06-07T07:10:46Z

Thank god... I thought it was me going nuts! en_US-ryan-high.onnx

I have had very strange results with this voice. I thought it was a certain character length that causes the issues, but I'm not sure about that and could be wrong. It can sound like a man trying to do a "world's fastest rapper" impression at times... and strangely, sometimes it does work OK. I'm just adding a bit more information here for others who may experience this.

I was about to go down a whole route of trying to figure out what was wrong with my system setup, but it looks like it's just that one voice.
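One way to probe the character-length guess is to synthesize progressively longer prefixes of a failing sentence and listen for the point where the output breaks down. This is only a sketch of that idea; `word_prefixes` is a hypothetical helper, and each prefix would still need to be piped to piper by hand or by script.

```python
def word_prefixes(text, step=5):
    """Return prefixes of `text` that grow by `step` words, ending with the full text."""
    words = text.split()
    # Cut points at every `step` words, always including the full length last.
    cuts = list(range(step, len(words), step)) + [len(words)]
    return [" ".join(words[:n]) for n in cuts]


# Example: print each prefix with its character count.
for prefix in word_prefixes("one two three four five six seven", step=3):
    print(len(prefix), prefix)
```

If the mumbling always begins at roughly the same character count regardless of the words used, that would point toward length; if it depends on the specific words, it would point more toward the training-data explanation suggested earlier in the thread.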


vcalv changed the title ~~output is just mumbled for a specific sentene and voice~~ output is mumbled for a specific sentence and voice Jul 9, 2023

erew123 added a commit to erew123/alltalk_tts that referenced this issue Jun 11, 2024

Removed ryan high and medium

bc0ebc0

Both those models have issues with generation rhasspy/piper#128
