feat: Volume, Speech Rate, and Pitch Controls for Text-to-Speech (TTS) Output #1331

silentoplayz · 2024-03-28T02:28:41Z

Problem Description:
The current version of Open WebUI lacks the necessary customization options for the text-to-speech (TTS) output, including volume control, speech rate adjustment, pitch adjustment, and audio playback functionality for speaking out notifications. These limitations hinder the user experience and accessibility of the text-to-speech (TTS) feature.

Describe the solution you'd like:
I propose the implementation of the following features to enhance the TTS output customization:

A volume control slider to adjust the volume of the TTS output.
A "Speech Rate" slider to adjust the speed of the TTS output.
A "Pitch" slider enabling users to modify the voice pitch of the TTS output.
An option to enable or disable audio playback for speaking out notifications.
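
For illustration, here is a minimal sketch of how the three slider values could be applied, assuming the TTS output goes through the browser's Web Speech API (`SpeechSynthesisUtterance`). The `TtsSettings` shape and the `applyTtsSettings` helper are hypothetical names, not existing Open WebUI code. The ranges are the ones the Web Speech API defines: volume 0 to 1, rate 0.1 to 10, pitch 0 to 2.

```typescript
// Hypothetical settings object that the three sliders would write to.
interface TtsSettings {
  volume: number; // 0 (silent) to 1 (full)
  rate: number;   // 0.1 (slowest) to 10 (fastest), 1 is normal speed
  pitch: number;  // 0 (lowest) to 2 (highest), 1 is normal pitch
}

// Structural type: a real SpeechSynthesisUtterance satisfies it,
// and so does a plain object, which keeps the helper testable outside a browser.
interface UtteranceLike {
  volume: number;
  rate: number;
  pitch: number;
}

const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

// Copies the slider values onto the utterance, clamped to the Web Speech API ranges.
function applyTtsSettings<T extends UtteranceLike>(utterance: T, settings: TtsSettings): T {
  utterance.volume = clamp(settings.volume, 0, 1);
  utterance.rate = clamp(settings.rate, 0.1, 10);
  utterance.pitch = clamp(settings.pitch, 0, 2);
  return utterance;
}

// In a browser this would be used roughly as:
//   speechSynthesis.speak(applyTtsSettings(new SpeechSynthesisUtterance(text), settings));
```

Clamping in one place means a stale or hand-edited stored setting cannot push the utterance outside the ranges the browser accepts.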

Alternative solution:
Offer predefined volume, speed, & pitch options instead of a slider for a simpler interface.
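
As a rough sketch of what the preset variant might look like: each control offers three named levels that map to fixed numeric values. The level names, the numbers, and the `resolvePresets` helper are all assumptions chosen for illustration; the real values would need tuning.

```typescript
// Hypothetical preset levels; names and numeric values are illustrative only.
type PresetLevel = "low" | "normal" | "high";

const VOLUME_PRESETS: Record<PresetLevel, number> = { low: 0.3, normal: 0.6, high: 1 };
const RATE_PRESETS: Record<PresetLevel, number> = { low: 0.75, normal: 1, high: 1.5 };
const PITCH_PRESETS: Record<PresetLevel, number> = { low: 0.5, normal: 1, high: 1.5 };

interface PresetSelection {
  volume: PresetLevel;
  rate: PresetLevel;
  pitch: PresetLevel;
}

// Translates the user's preset choices into the numeric values a TTS engine expects.
function resolvePresets(selection: PresetSelection): { volume: number; rate: number; pitch: number } {
  return {
    volume: VOLUME_PRESETS[selection.volume],
    rate: RATE_PRESETS[selection.rate],
    pitch: PITCH_PRESETS[selection.pitch],
  };
}
```

The trade-off is a simpler interface (three dropdowns or button groups) at the cost of fine-grained control.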

Alternatives Considered:
Manually adjusting the device's overall volume or utilizing third-party applications to manipulate speech output and volume settings represents a workaround. However, this solution proves to be inconvenient for users, necessitating the addition of these much-needed features within Open WebUI.

Additional Context:
This feature request focuses on improving the text-to-speech (TTS) feature's accessibility and overall user experience. Implementing these requested features, including volume, speed, and pitch adjustments, will significantly enhance user satisfaction and convenience. It's crucial to maintain compatibility with existing features, ensuring this customization suite does not adversely impact any existing functionalities or behaviors.

dannyl1u · 2024-04-08T05:25:49Z

I think this would be a good feature. How does this look for the UI?

silentoplayz · 2024-04-08T06:21:08Z

That looks good to me @dannyl1u, although, do you think the sliders could take on a similar form as the model advanced parameter sliders? I only ask because I feel that tjbck would step in to ask the same thing eventually or even make the adjustment himself.

P.S: Thank you for your contributions to Open WebUI!

dannyl1u · 2024-04-08T06:49:05Z

> That looks good to me @dannyl1u, although, do you think the sliders could take on a similar form as the model advanced parameter sliders? I only ask because I feel that tjbck would step in to ask the same thing eventually or even make the adjustment themselves.
>
> P.S: Thank you for your contributions to Open WebUI!

Yes! Thanks for the suggestion, I forgot those sliders existed 😆. That's definitely the better UI and I'll reuse that!

UXVirtual · 2024-04-26T23:16:42Z

@dannyl1u another challenge with TTS output I've noticed is that generated markdown code blocks are read aloud.

Making this a toggle option, and stripping the code blocks prior to the extractSentences() call when it is toggled on, would help with coding-assistant use cases.
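
A minimal sketch of that stripping step, under some assumptions: code blocks are fenced with three or more backticks or tildes, the toggle is a plain boolean, and `stripCodeBlocks` and `prepareForSpeech` are hypothetical helper names. The result would then be passed to the existing extractSentences() call. Unclosed fences (for example, while a response is still streaming) are left untouched here.

```typescript
// Removes fenced code blocks (backtick or tilde fences of length 3+) from markdown.
// A block starts at a line beginning with a fence and ends at the next line
// that consists of the same fence string.
function stripCodeBlocks(markdown: string): string {
  const fencedBlock = /^(`{3,}|~{3,})[^\n]*\n[\s\S]*?^\1[ \t]*$/gm;
  return markdown
    .replace(fencedBlock, "")
    .replace(/\n{3,}/g, "\n\n") // collapse the gap left behind by a removed block
    .trim();
}

// Applies the stripping only when the (hypothetical) "skip code" toggle is on.
function prepareForSpeech(text: string, skipCode: boolean): string {
  return skipCode ? stripCodeBlocks(text) : text;
}
```

Stripping before sentence extraction keeps the toggle independent of whichever TTS engine speaks the sentences afterwards.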

littledot2020 · 2024-05-30T11:53:03Z

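@dannyl1u another challenge I've noticed with TTS output is that generated markdown code blocks are read aloud.

Making this a toggle option, and stripping the code blocks prior to the extractSentences() call when it is toggled on, would help with coding-assistant use cases.
I would also like to know how to play back the content as it is formatted after the markdown has been converted.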

dannyl1u mentioned this issue Apr 8, 2024

feat: TTS output controls #1456

Closed

4 tasks

silentoplayz added the enhancement (New feature or request) and core (core feature) labels Apr 19, 2024

silentoplayz mentioned this issue May 11, 2024

enhancement: TTS skip code sections #2152

Open

silentoplayz added the help wanted (Extra attention is needed) and good first issue (Good for newcomers) labels Jun 1, 2024
