Is your feature request related to a problem?
My feature request is related to several problems I am experiencing with the current version of SpeechGPT. I am frustrated when:
The keyboard remains visible even after completing my input, which takes up unnecessary screen space and makes it harder to read the chat.
The keyboard still shows up while I interact with the assistant using speech recognition, which is unnecessary in that scenario and can be distracting.
Many average users are confused about how to set the speech recognition/synthesis language and language ID. I would prefer an easier way to set these through environment variables, so that average users can start with sensible default configurations.
When the assistant generates a lengthy response, I have to wait for the entire answer to be generated before I can listen to or read it. Streaming output for both text and TTS would make this process smoother and more enjoyable.
I often want to replay the assistant's response or my own input via TTS but currently cannot do so, which is inconvenient when I need to review previous interactions.
Describe the solution you'd like
Hide the keyboard after the user completes input and show it again after ChatGPT completes the response. The repo ddiu8081/chatgpt-demo does this well. You can take a look at it if you like.
Do not show the keyboard when the user interacts with the assistant via speech recognition.
Ability to set the default speech recognition/synthesis language and language ID via environment variables. (Many average users find setting these up a bit confusing at first.)
Streaming output for the assistant's response, if possible, plus streaming TTS output. (This is very helpful when the assistant generates a long response.)
Ability to replay the assistant's response or the user's input via the TTS engine.
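To illustrate the environment-variable idea, here is a minimal sketch of how defaults could be resolved. The variable names (`DEFAULT_RECOGNITION_LANGUAGE`, `DEFAULT_SYNTHESIS_LANGUAGE`), the `en-US` fallback, and the `resolveSpeechDefaults` helper are all hypothetical and not part of SpeechGPT. The environment is passed in as a plain object, so the same function could be fed from whatever mechanism the build tool uses to expose variables.

```typescript
// Hypothetical sketch: none of these names exist in SpeechGPT today.
type Env = Record<string, string | undefined>;

interface SpeechDefaults {
  recognitionLanguage: string;
  synthesisLanguage: string;
}

// Assumed fallback used when a variable is unset or blank.
const FALLBACK_LANGUAGE = "en-US";

function resolveSpeechDefaults(env: Env): SpeechDefaults {
  // Treat missing and whitespace-only values the same way.
  const pick = (key: string): string | undefined => {
    const value = (env[key] ?? "").trim();
    return value.length > 0 ? value : undefined;
  };

  return {
    recognitionLanguage: pick("DEFAULT_RECOGNITION_LANGUAGE") ?? FALLBACK_LANGUAGE,
    synthesisLanguage: pick("DEFAULT_SYNTHESIS_LANGUAGE") ?? FALLBACK_LANGUAGE,
  };
}
```

A language ID default could be handled with one more key in the same way. Users who change the language in the settings UI would simply override these values.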
Additional context
No response
My plan is to add an option that allows users to choose whether to display the keyboard during speech recognition, as speech recognition may produce errors, and displaying the keyboard would allow users to quickly correct mistakes.
Different services have different supported languages and voices, so using environment variables for configuration might be complicated.
Currently, I have not found any TTS API that supports streaming. A possible solution is to split the assistant's responses into multiple sentences and send multiple requests.
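The sentence-splitting workaround could be sketched roughly as follows. This is only an illustration under stated assumptions: `splitIntoSentences` and `speakBySentence` are hypothetical helpers, and `synthesize` stands in for whichever TTS request the app already makes. The splitter is deliberately naive and will also break on decimal points and abbreviations such as "3.14" or "e.g.".

```typescript
// Split text after sentence-ending punctuation (Latin and CJK).
// Naive by design: it does not handle abbreviations or decimals.
function splitIntoSentences(text: string): string[] {
  const matches = text.match(/[^.!?。！？]+[.!?。！？]*\s*/g) ?? [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Send one TTS request per sentence, in order, so playback can begin
// after the first sentence instead of after the whole response.
// `synthesize` is a placeholder for the real TTS call.
async function speakBySentence(
  text: string,
  synthesize: (sentence: string) => Promise<void>
): Promise<number> {
  const sentences = splitIntoSentences(text);
  for (const sentence of sentences) {
    await synthesize(sentence);
  }
  return sentences.length;
}
```

Awaiting each request in turn keeps the audio in order. Requesting the next sentence while the current one is playing would reduce the gaps between sentences, at the cost of a slightly more involved queue.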