tts coqui xtts_v2 not working without speaker_idx #2073

blob42 · 2024-04-19T02:20:27Z

Is your feature request related to a problem? Please describe.

I am trying to use the xtts_v2 coqui model, but it seems unusable unless a speaker_idx is passed. Without one, the API call does not generate any audio.

Describe the solution you'd like

Either reuse the existing "voice" parameter to carry the speaker_idx: if its value is not an absolute path, it is treated as a speaker.
Or add a new parameter to the API call.
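
A minimal sketch of the first option, to make the rule concrete. The helper name `resolve_voice` is hypothetical, and the returned keys `speaker` and `speaker_wav` are an assumption about what the backend would forward to Coqui's synthesis call; this is not the existing LocalAI code.

```python
import os


def resolve_voice(voice: str) -> dict:
    """Map the API's "voice" value to keyword arguments for the TTS call.

    Hypothetical helper: the function name and the returned keys
    ("speaker", "speaker_wav") are assumptions made for illustration.
    """
    if not voice:
        # Nothing supplied: leave the choice of speaker to the backend.
        return {}
    if os.path.isabs(voice):
        # An absolute path is treated as a reference audio file.
        return {"speaker_wav": voice}
    # Anything else is treated as the name of a built-in speaker.
    return {"speaker": voice}
```

With this rule, a request whose "voice" is `/tmp/reference.wav` would be handled as a reference recording, while a value such as `some_speaker` would be passed through as a speaker name.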

Extra testing

I tried running xtts_v2 manually with the speaker_idx set, and it seems to work.

I am willing to tackle this task.

I guess it would also make sense to add an API endpoint to list the available speakers.


localai-bot · 2024-04-19T08:06:26Z

Hello! Thank you for submitting this feature request, and I'm happy to assist. Based on your description, it seems that the issue you're facing with the xtts_v2 Coqui model in LocalAI is related to the absence of audio generation when a speaker_idx is not passed. You would like to see two possible solutions implemented:

Utilize the existing "voice" parameter to set the speaker_idx internally, provided it's not considered an absolute path.
Introduce a new parameter in the API call specifically for speaker_idx.

Additionally, you suggest that adding an API endpoint to list available speakers would also be a useful addition. You're willing to contribute to this task, which is great!

As a bot, an experiment of @mudler, I'll make sure to relay this information to the appropriate team for consideration. We appreciate your contribution to enhancing LocalAI!

mudler · 2024-04-19T08:13:05Z

Is your feature request related to a problem? Please describe.

I am trying to use the xtts_v2 coqui model but it seems unusable unless a speaker_idx is passed. The api call does not generate any audio.

Describe the solution you'd like

Either use the existing "voice" parameter and set the speaker_idx inside. If it is not an absolute path then it is considered a speaker.

I can confirm "voice" isn't treated as an absolute path, so it would make sense to me to reuse it here (and just add documentation as needed).

blob42 added the enhancement New feature or request label Apr 19, 2024

blob42 changed the title ~~tts coqui xtts_v2: possibliity to pass the speaker_idx~~ tts coqui xtts_v2 not working without speaker_idx Apr 19, 2024

blob42 mentioned this issue Apr 20, 2024

TTS API improvements #2086

Closed

6 tasks

blob42 mentioned this issue May 13, 2024

TTS API improvements #2308

Merged

6 tasks
