The UI should also support, as OpenAI and Grok do, switching the chat from voice to text and from text to voice on the spot.
So what this really needs to do is:
Have full realtime audio support with noise cancelling on each platform. (i.e. something like the realtime_audio library, but expanded to all platforms and fixed)
Have visualization while you're talking.
Have playback audio support with visualization. (using flutter_sound or whatever, or your own)
Show the STT transcription that comes back from the server. (this is just adding a message to the screen)
Have the transcript of what the AI is saying be injectable from your provider. (this is just adding the message to the screen)
Make sure that you can interrupt properly. (i.e. I should be able to talk over the AI, and you should notice that I did, stop the AI's playback, and start streaming my audio cleanly back to the server)
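To make the interruption and visualization items concrete, here is a rough sketch of the logic in Python (not Dart, and not tied to any existing package). Everything in it is an assumption for illustration: the names `rms_level` and `BargeInDetector`, the 16-bit PCM frame format, and the threshold values are all hypothetical. A real implementation would also need echo cancellation so the AI's own playback does not trigger the interrupt.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Sequence


def rms_level(frame: Sequence[int]) -> float:
    """RMS level of a 16-bit PCM frame, normalized to 0.0..1.0.

    The same value can drive a talking/playback visualizer.
    """
    if not frame:
        return 0.0
    mean_square = sum(s * s for s in frame) / len(frame)
    return math.sqrt(mean_square) / 32768.0


@dataclass
class BargeInDetector:
    """Hypothetical helper: fires on_interrupt when the user talks over the AI."""

    on_interrupt: Callable[[], None]
    threshold: float = 0.05      # assumed loudness threshold, needs tuning
    frames_required: int = 3     # consecutive loud frames before interrupting
    ai_speaking: bool = False    # set to True while AI audio is playing
    _loud_frames: int = field(default=0, init=False)

    def feed_mic_frame(self, frame: Sequence[int]) -> float:
        """Process one microphone frame and return its level for the UI."""
        level = rms_level(frame)
        if self.ai_speaking and level >= self.threshold:
            self._loud_frames += 1
            if self._loud_frames >= self.frames_required:
                # User is talking over the AI: stop playback, resume streaming.
                self.ai_speaking = False
                self._loud_frames = 0
                self.on_interrupt()
        else:
            self._loud_frames = 0
        return level
```

In use, the app would set `ai_speaking = True` when playback starts, feed every microphone frame through `feed_mic_frame`, and have `on_interrupt` stop the player and tell the server to cancel the current response. Requiring several consecutive loud frames is one simple way to avoid interrupting on a single click or cough.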
Please expand this to do full voice to voice.
You could go further and:
By doing this you could use any text chat AI, and you'd get realtime, with most of the realtime work done on the front end.
But these "further" items are optional. We really need support for realtime voice to voice.