Hello everyone,
I am currently trying to use OpenVoice for German speech generation, but I have not been able to figure out how the zero-shot speech synthesis is supposed to work. Is some kind of multilingual base model missing? When I use one of the language-dependent base models, the output sounds strange.
It would also be helpful if someone could explain how the different emotions/speech styles can be controlled. The API documentation would benefit from more examples.
The text-to-speech synthesis in v1 is driven by the OpenAI TTS system, while v2 uses MeloTTS; in my experience v2 sounds noticeably better.
On first run the models are downloaded to your system automatically, and OpenVoice then performs tone color conversion on the synthesized audio. Here is the demo setup:
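A rough sketch of the v2 pipeline for German, based on the demo scripts in the OpenVoice repository: MeloTTS generates the base audio for the `DE` language model, then the tone color converter transfers the timbre of a reference recording onto it. The checkpoint paths and the reference file name (`reference.wav`) are assumptions — adjust them to wherever you extracted the v2 checkpoints.

```python
import torch
from melo.api import TTS
from openvoice import se_extractor
from openvoice.api import ToneColorConverter

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1. Base synthesis with the MeloTTS German model (downloads on first run).
tts = TTS(language="DE", device=device)
speaker_id = tts.hps.data.spk2id["DE"]
tts.tts_to_file("Guten Tag, dies ist ein Test.", speaker_id, "tmp.wav", speed=1.0)

# 2. Tone color conversion onto a reference voice.
#    Paths below assume the v2 checkpoints were unpacked into ./checkpoints_v2.
converter = ToneColorConverter("checkpoints_v2/converter/config.json", device=device)
converter.load_ckpt("checkpoints_v2/converter/checkpoint.pth")

# Speaker embedding of the base German speaker (shipped with the checkpoints)
source_se = torch.load("checkpoints_v2/base_speakers/ses/de.pth", map_location=device)
# Speaker embedding extracted from your reference recording
target_se, _ = se_extractor.get_se("reference.wav", converter, vad=True)

converter.convert(
    audio_src_path="tmp.wav",
    src_se=source_se,
    tgt_se=target_se,
    output_path="output_de.wav",
)
```

Note that in v2 the emotion/style control you ask about is largely gone: MeloTTS only exposes parameters like `speed`, whereas the v1 `BaseSpeakerTTS` accepted a style argument (e.g. `speaker='friendly'`) for English.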