Elevenlabs TTS websocket connection design #306

Closed
fjprobos opened this issue May 21, 2024 · 3 comments

fjprobos commented May 21, 2024

Hi,

I was able to make the minimal_assistant.py implementation work. Once I sorted out all the difficulties, it runs pretty well! Kudos for that 😃.

I have a question regarding the WebSocket connections used in the ElevenLabs TTS module. In my environment, a new WebSocket is created every time the agent responds to the user, and it is closed every time the agent stops talking.

Questions:

  • Is this a design decision? If so, could you please explain the rationale behind it?

  • Is there a specific reason for not maintaining a persistent WebSocket connection throughout the session?

I believe closing and reopening the WebSocket repeatedly introduces unnecessary overhead. Maintaining one or a few stable connections throughout the session might be more efficient.

Looking forward to your insights on this.

Thank you!

@keepingitneil
Contributor

This was a constraint of the ElevenLabs API: additional text can't be sent on the same WebSocket after an EOS, and the EOS signal is what we use to flush.

Looking at their docs now, it looks like they have since introduced a "flush" flag in their protocol which we can look into using.
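For illustration, here is a minimal sketch of the difference between the two signals at the message level. The field names (`text`, `flush`) and the empty-string EOS message are assumptions based on ElevenLabs' public streaming docs and should be checked against the current protocol; the helper names are hypothetical and not part of this repo.

```python
import json


def text_message(text: str, flush: bool = False) -> str:
    """Build a text chunk message (assumed shape, not verified against the live API).

    With flush=True the server is asked to generate audio for everything
    buffered so far while the socket stays open for more text.
    """
    payload = {"text": text}
    if flush:
        payload["flush"] = True
    return json.dumps(payload)


def eos_message() -> str:
    """Build the end-of-stream message (assumed to be an empty text field).

    After this is sent, no further text is accepted on the same socket,
    which is why the current design opens a new connection per utterance.
    """
    return json.dumps({"text": ""})
```

With a flush flag, one socket could in principle serve several utterances: send `text_message("...", flush=True)` at the end of each one and reserve `eos_message()` for the end of the session.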

With that being said, typically there is no additional latency introduced to the end-user with this strategy because the next websocket connection will have been connected long before speech generation is needed.
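A rough sketch of that pre-connect idea, not the actual implementation in this repo: keep one connection attempt in flight so the handshake for the next utterance overlaps with the current one. `WarmConnection` and the `connect` factory are hypothetical names; the factory stands in for whatever coroutine opens the ElevenLabs WebSocket.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class WarmConnection:
    """Keeps one connection attempt in flight ahead of demand."""

    def __init__(self, connect: Callable[[], Awaitable[Any]]) -> None:
        self._connect = connect  # hypothetical async factory for a fresh socket
        self._next: Optional[asyncio.Task] = None

    def prewarm(self) -> None:
        # Must be called from inside a running event loop.
        if self._next is None:
            self._next = asyncio.create_task(self._connect())

    async def acquire(self) -> Any:
        # Use the connection that was warming, then start warming the next one.
        self.prewarm()
        task, self._next = self._next, None
        conn = await task
        self.prewarm()
        return conn

    def close(self) -> None:
        # Drop the pending attempt, if any.
        if self._next is not None:
            self._next.cancel()
            self._next = None


async def demo() -> tuple:
    counter = 0

    async def fake_connect() -> str:
        # Stand-in for a real WebSocket handshake.
        nonlocal counter
        await asyncio.sleep(0.01)
        counter += 1
        return f"conn-{counter}"

    warm = WarmConnection(fake_connect)
    first = await warm.acquire()   # pays the handshake cost once
    second = await warm.acquire()  # handshake was already started after the first acquire
    warm.close()
    return first, second
```

Each utterance still gets its own socket (so the EOS constraint is respected), but the connection cost is paid while the previous utterance is being spoken rather than when speech generation is needed.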

@fjprobos
Author

fjprobos commented May 25, 2024 via email

parshvadaftari pushed a commit to parshvadaftari/agents that referenced this issue Jun 23, 2024
Co-authored-by: sweep-ai[bot] <128439645+sweep-ai[bot]@users.noreply.github.com>
@theomonnom
Member

Hey, the 11labs WebSocket connection is initialized as soon as text is pushed. The connection is then closed on flush (due to 11labs API limitations).
