feat: Volume, Speech Rate, and Pitch Controls for Text-to-Speech (TTS) Output #1331

Open
silentoplayz opened this issue Mar 28, 2024 · 5 comments
Labels
core core feature enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed

Comments

@silentoplayz
Collaborator

Problem Description:
The current version of Open WebUI lacks customization options for text-to-speech (TTS) output: volume control, speech rate adjustment, pitch adjustment, and an option to read notifications aloud. These limitations hurt the usability and accessibility of the TTS feature.

Describe the solution you'd like:
I propose the implementation of the following features to enhance the TTS output customization:

  1. A volume control slider to adjust the volume of the TTS output.
  2. A "Speech Rate" slider to adjust the speed of the TTS output.
  3. A "Pitch" slider enabling users to modify the voice pitch of the TTS output.
  4. An option to enable or disable audio playback for speaking out notifications.
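A minimal sketch of how the three sliders could map onto speech parameters, assuming the browser's Web Speech API is the TTS engine (its `SpeechSynthesisUtterance` accepts `volume` in 0 to 1, `rate` in 0.1 to 10, and `pitch` in 0 to 2). The names `TtsSettings`, `normalizeTtsSettings`, and `applyTtsSettings` are hypothetical and are not taken from the Open WebUI codebase.

```typescript
// Value ranges follow the Web Speech API's SpeechSynthesisUtterance properties.
interface TtsSettings {
  volume: number; // 0 to 1
  rate: number;   // 0.1 to 10
  pitch: number;  // 0 to 2
}

const TTS_DEFAULTS: TtsSettings = { volume: 1, rate: 1, pitch: 1 };

// Clamp a slider value into [min, max]; fall back when it is missing or not finite.
function clampOr(value: number | undefined, fallback: number, min: number, max: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: turn raw slider values into settings the speech engine accepts.
function normalizeTtsSettings(input: Partial<TtsSettings>): TtsSettings {
  return {
    volume: clampOr(input.volume, TTS_DEFAULTS.volume, 0, 1),
    rate: clampOr(input.rate, TTS_DEFAULTS.rate, 0.1, 10),
    pitch: clampOr(input.pitch, TTS_DEFAULTS.pitch, 0, 2),
  };
}

// Typed structurally, so it accepts a real SpeechSynthesisUtterance in the
// browser or a plain object in tests.
function applyTtsSettings<T extends TtsSettings>(utterance: T, input: Partial<TtsSettings>): T {
  Object.assign(utterance, normalizeTtsSettings(input));
  return utterance;
}
```

In a browser this could be used as `speechSynthesis.speak(applyTtsSettings(new SpeechSynthesisUtterance(text), { rate: 1.25 }))`. Clamping in one place keeps out-of-range slider values from ever reaching the speech engine.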

Alternative solution:
Offer predefined volume, speed, and pitch options instead of sliders for a simpler interface.
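As a rough illustration of the preset approach, the sketch below maps named options to speech-rate values. The preset names and numbers are assumptions chosen for illustration, and `rateForPreset` is a hypothetical helper, not an existing Open WebUI function.

```typescript
// Hypothetical preset table; the names and values are illustrative only.
type RatePreset = "slow" | "normal" | "fast";

const RATE_PRESETS: Record<RatePreset, number> = {
  slow: 0.75,
  normal: 1,
  fast: 1.5,
};

// Look up a preset by name, falling back to "normal" for unknown names.
// hasOwnProperty guards against inherited keys such as "toString".
function rateForPreset(name: string): number {
  if (Object.prototype.hasOwnProperty.call(RATE_PRESETS, name)) {
    return RATE_PRESETS[name as RatePreset];
  }
  return RATE_PRESETS.normal;
}
```

The same table shape would work for volume and pitch presets.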

Alternatives Considered:
Manually adjusting the device's overall volume or utilizing third-party applications to manipulate speech output and volume settings represents a workaround. However, this solution proves to be inconvenient for users, necessitating the addition of these much-needed features within Open WebUI.

Additional Context:
This feature request focuses on improving the text-to-speech (TTS) feature's accessibility and overall user experience. Implementing these requested features, including volume, speed, and pitch adjustments, will significantly enhance user satisfaction and convenience. It's crucial to maintain compatibility with existing features, ensuring this customization suite does not adversely impact any existing functionalities or behaviors.

@dannyl1u
Collaborator

dannyl1u commented Apr 8, 2024

I think this would be a good feature. How does this look for the UI?

image

@dannyl1u dannyl1u mentioned this issue Apr 8, 2024
4 tasks
@silentoplayz
Collaborator Author

silentoplayz commented Apr 8, 2024

That looks good to me @dannyl1u, although, do you think the sliders could take on a similar form as the model advanced parameter sliders? I only ask because I feel that tjbck would step in to ask the same thing eventually or even make the adjustment himself.

Screenshot 2024-03-16 141517

P.S: Thank you for your contributions to Open WebUI!

@dannyl1u
Collaborator

dannyl1u commented Apr 8, 2024

> That looks good to me @dannyl1u, although, do you think the sliders could take on a similar form as the model advanced parameter sliders? I only ask because I feel that tjbck would step in to ask the same thing eventually or even make the adjustment themselves.
>
> Screenshot 2024-03-16 141517
>
> P.S: Thank you for your contributions to Open WebUI!

Yes! Thanks for the suggestion, I forgot those sliders existed 😆. That's definitely the better UI and I'll reuse it!

@silentoplayz silentoplayz added enhancement New feature or request core core feature labels Apr 19, 2024
@UXVirtual

@dannyl1u another challenge I've noticed with TTS output is that generated markdown code blocks are read aloud.

Making this a toggle option, and stripping the code block prior to the extractSentences() call if it is toggled on would help with coding assistant use-cases.
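A minimal sketch of that idea: remove fenced code blocks from the message text when a toggle is on, then pass the result on to sentence extraction. `stripCodeBlocks`, `prepareTextForSpeech`, and the `skipCodeBlocks` flag are hypothetical names, not existing Open WebUI functions, and the regex only handles complete blocks whose opening and closing fences start at the beginning of a line.

```typescript
// Hypothetical helper: drop fenced code blocks (``` or ~~~) from markdown text.
function stripCodeBlocks(markdown: string): string {
  return markdown
    // Opening fence at line start, any info string, lazy body, matching closing fence.
    .replace(/^(```|~~~)[^\n]*\n[\s\S]*?^\1[ \t]*$/gm, "")
    // Collapse the blank lines left behind by a removed block.
    .replace(/\n{3,}/g, "\n\n")
    .trim();
}

// Hypothetical wrapper: apply the stripping only when the user enabled the toggle.
// The returned text is what would then be handed to extractSentences().
function prepareTextForSpeech(markdown: string, skipCodeBlocks: boolean): string {
  return skipCodeBlocks ? stripCodeBlocks(markdown) : markdown;
}
```

An unterminated fence, such as one in a response that is still streaming, is left untouched by this regex, so a real implementation would need to decide how to treat partial blocks.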

@littledot2020

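> @dannyl1u another challenge I've noticed with TTS output is that generated markdown code blocks are read aloud.
>
> Making this a toggle option, and stripping the code block prior to the extractSentences() call when it is toggled on, would help with coding assistant use-cases.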
I would also like to know how to have TTS play the content as it is formatted after the markdown has been converted.

@silentoplayz silentoplayz added help wanted Extra attention is needed good first issue Good for newcomers labels Jun 1, 2024
4 participants