support Audio IO in Chatbot #2768
Comments
+1 |
+1 |
Hi @rishikeshF you can already add audio files to your |
Hello, I'm building a full AI assistant bot on top of Gradio (wwd, s2t, information retrieval), so I have to control how many interfaces and components render and what they output. No component gives a fast, reactive response except the Chatbot component, which is why I love it: it's very powerful and straightforward to use. But I don't think this issue is finished yet. We need to be able to record audio the same way we currently type text and press Enter: record audio, stop recording, press Enter, handle any post-processing (wav2vec2, Whisper), then use LLMs and neural search. I think this is the correct implementation. The field is also trending toward more and more multimodality (write, record, send an image), as in today's 'XXX GPTs'. This would certainly be a nice addition to the library. |
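The flow described in this comment (record, post-process, then query a model) can be sketched independently of Gradio. This is only an illustration: `voice_turn`, `fake_transcribe`, and `fake_respond` are hypothetical names, and the two callables stand in for whatever speech-to-text and LLM or search backends are actually used. The return value uses the `((filepath,), text)` turn shape that appears later in this thread.

```python
from typing import Callable, Tuple

# One chat turn: (audio file as a 1-tuple, text reply)
Turn = Tuple[Tuple[str], str]


def voice_turn(
    audio_path: str,
    transcribe: Callable[[str], str],
    respond: Callable[[str], str],
) -> Turn:
    """Run one recorded clip through speech-to-text, then a text model."""
    text = transcribe(audio_path)  # hypothetical: e.g. a Whisper or wav2vec2 wrapper
    reply = respond(text)          # hypothetical: e.g. an LLM or neural-search call
    return ((audio_path,), reply)


# Stand-in backends so the sketch runs without any model installed.
def fake_transcribe(path: str) -> str:
    return "hello"


def fake_respond(text: str) -> str:
    return f"You said: {text}"


turn = voice_turn("clip.wav", fake_transcribe, fake_respond)
```

Passing the backends in as arguments keeps the turn-building logic separate from any particular model, so either step can be swapped without touching the UI code.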
Hi @jpmcarrilho just to confirm, you'd like to be able to record audio, do some preprocessing and display it in a Chatbot? This is already possible by using the |
Yes, sorry for the bad English. I want to be able to record with gr.Audio, click 'stop recording', press Enter, and send the audio directly to the chat, or send it as soon as I click stop. I saw the docs, but they only cover file uploads. Can you provide a reference for sending the Audio component's output to the chatbot as soon as I stop recording? |
No worries @jpmcarrilho. Yes, this is currently possible in Gradio. You could do something like this:

    import os
    import gradio as gr

    def get_chatbot_response(x):
        # x is the filepath of the recording; give it a .wav extension
        os.rename(x, x + '.wav')
        # One chat turn: (audio file as a 1-tuple, text reply)
        return [((x + '.wav',), "Your voice sounds nice!")]

    with gr.Blocks() as demo:
        chatbot = gr.Chatbot()
        mic = gr.Audio(source="microphone", type="filepath")
        # Update the chatbot whenever the recorded audio changes
        mic.change(get_chatbot_response, mic, chatbot)

    demo.launch()

Which produces this: |
Well, thank you, that is exactly what I needed. |
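Note that the snippet above returns a single-turn list, so each new recording replaces whatever the chatbot was showing. A minimal sketch of accumulating turns instead is given here, assuming the same `((filepath,), text)` turn format; `append_audio_turn` is a hypothetical helper, not part of Gradio, and the empty temporary file only stands in for a real recording.

```python
import os
import shutil
import tempfile


def append_audio_turn(history, audio_path, reply):
    """Return a new history list with one (audio, reply) turn appended."""
    # Copy rather than rename, so the original recording stays in place.
    wav_path = audio_path if audio_path.endswith(".wav") else audio_path + ".wav"
    if wav_path != audio_path:
        shutil.copy(audio_path, wav_path)
    return history + [((wav_path,), reply)]


# Demo with an empty placeholder file standing in for a recording.
tmp_dir = tempfile.mkdtemp()
clip = os.path.join(tmp_dir, "recording")
open(clip, "wb").close()

history = []
history = append_audio_turn(history, clip, "Your voice sounds nice!")
history = append_audio_turn(history, clip, "Still sounds nice!")
```

One way to wire this into the demo, as an assumption rather than a tested recipe, would be to keep the history in a `gr.State` and pass it as both an input and an output of `mic.change`.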
@abidlabs is there an option to auto-play chatbot responses? |
@rc-eddy currently, there isn't. Although I'll look into adding that when I refactor the chatbot. |
@abidlabs @dawoodkhan82 Have you had a chance to integrate the auto-play feature for the audio in the chatbot responses? If not, is there a workaround that you can recommend? |
Is your feature request related to a problem? Please describe.
Given that Chatbot now supports images after v3.12.0, I think we should double down on the multi-modality and add Audio IO to the chatbot interface. It will be a great interface for Voice Assistant demos.
Describe the solution you'd like
Gradio already has all the building blocks for this feature; we just need to put them together. I think something like the WeChat interface would be nice. Inside the chatbot interface, we replace each text message with a bar, which is the Audio output and plays the audio when clicked. Then we can put the audio input button somewhere prominent for easy input.