tts coqui xtts_v2 not working without speaker_idx #2073

Open
blob42 opened this issue Apr 19, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@blob42
Contributor

blob42 commented Apr 19, 2024

Is your feature request related to a problem? Please describe.

I am trying to use the xtts_v2 Coqui model, but it seems unusable unless a speaker_idx is passed: without one, the API call does not generate any audio.

Describe the solution you'd like

  1. Either reuse the existing "voice" parameter to carry the speaker_idx: if the value is not an absolute path, it is treated as a speaker name.
  2. Or add a new parameter to the API call.
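
To make option 1 concrete, here is a minimal sketch of the dispatch in Python. The helper name `voice_to_speaker_kwargs` is hypothetical, and the returned keys (`speaker`, `speaker_wav`) assume the backend forwards them to Coqui's Python API; the actual LocalAI backend code may be structured differently.

```python
import os


def voice_to_speaker_kwargs(voice: str) -> dict:
    """Map the request's "voice" value to speaker arguments (hypothetical helper)."""
    if not voice:
        # Nothing supplied: leave speaker selection to the model defaults.
        return {}
    if os.path.isabs(voice):
        # Absolute path: treat the value as a reference audio file.
        return {"speaker_wav": voice}
    # Anything else: treat the value as a speaker name (the speaker_idx).
    return {"speaker": voice}
```

With this shape, a request whose voice is a plain name (for example a placeholder such as "example-speaker") would select a built-in speaker, while an absolute path would keep working as a reference file.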

Extra testing

I ran xtts_v2 manually with the speaker_idx set explicitly, and it appears to work.

I am willing to tackle this task.

I guess it would also make sense to add an API endpoint to list the available speakers.
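
As a rough illustration of what such an endpoint could return, the sketch below serializes a list of speaker names into a JSON body. The function name and the `{"speakers": [...]}` response shape are assumptions for discussion, not an existing LocalAI API; how the names are obtained from the loaded model is left to the backend.

```python
import json


def speakers_response(speaker_names) -> str:
    """Serialize speaker names into a JSON body (hypothetical response shape)."""
    # Deduplicate and sort so the listing is stable between calls.
    names = sorted(set(speaker_names or []))
    return json.dumps({"speakers": names})
```

A model with no named speakers would simply yield an empty list, which lets clients tell "no speakers available" apart from a failed request.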

@blob42 blob42 added the "enhancement" (New feature or request) label Apr 19, 2024
@blob42 blob42 changed the title from "tts coqui xtts_v2: possibliity to pass the speaker_idx" to "tts coqui xtts_v2 not working without speaker_idx" Apr 19, 2024
@localai-bot
Contributor

Hello! Thank you for submitting this feature request, and I'm happy to assist. Based on your description, it seems that the issue you're facing with the xtts_v2 Coqui model in LocalAI is related to the absence of audio generation when a speaker_idx is not passed. You would like to see two possible solutions implemented:

  1. Utilize the existing "voice" parameter to set the speaker_idx internally, provided it's not considered an absolute path.
  2. Introduce a new parameter in the API call specifically for speaker_idx.

Additionally, you suggest that adding an API endpoint to list available speakers would also be a useful addition. You're willing to contribute to this task, which is great!

As a bot, an experiment of @mudler, I'll make sure to relay this information to the appropriate team for consideration. We appreciate your contribution to enhancing LocalAI!

@mudler
Owner

mudler commented Apr 19, 2024

> Is your feature request related to a problem? Please describe.
>
> I am trying to use the xtts_v2 Coqui model, but it seems unusable unless a speaker_idx is passed: without one, the API call does not generate any audio.
>
> Describe the solution you'd like
>
>   1. Either reuse the existing "voice" parameter to carry the speaker_idx: if the value is not an absolute path, it is treated as a speaker name.

I can confirm that voice isn't treated as an absolute path, so it would make sense to me to reuse it here (and just add documentation as needed).

@blob42 blob42 mentioned this issue Apr 20, 2024
6 tasks
@blob42 blob42 mentioned this issue May 13, 2024
6 tasks