Live audio streaming output #5077

aliabid94 · 2023-08-03T06:46:04Z

This PR allows users to stream audio out. See demo/streaming_audio_out for an example that streams out pieces of an audio file second by second.

Fixes: #5110

vercel · 2023-08-03T06:46:10Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated (UTC)
gradio	✅ Ready (Inspect)	Visit Preview	Aug 8, 2023 9:56pm

gradio-pr-bot · 2023-08-03T06:46:40Z

🦄 change detected

This Pull Request includes changes to the following packages.

Package	Version
`@gradio/upload`	`minor`
`gradio`	`minor`

Maintainers can select this checkbox to manually select packages to update.

With the following changelog entry.

Live audio streaming output

Maintainers or the PR author can modify the PR title to modify this entry.

Something isn't right?

Maintainers can change the version label to modify the version bump.
If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

gradio-pr-bot · 2023-08-03T06:46:50Z

🎉 The demo notebooks match the run.py files! 🎉

gradio-pr-bot · 2023-08-03T06:48:59Z

All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-5077-all-demos

You can install the changes in this PR by running:

pip install https://gradio-builds.s3.amazonaws.com/ab0b17df50c938c4fa6ff0608805406025087774/gradio-3.39.0-py3-none-any.whl

abidlabs · 2023-08-03T17:18:42Z

Tested this out and it works great @aliabid94, even with multiple outputs! However, I'm concerned that it uses a completely different mechanism for streaming than regular generator functions, including a separate /stream route.

Is this actually necessary? I don't have a concrete alternative right now, but among other things, this breaks the client for any route with streaming. You can try by running:

import gradio as gr
import numpy as np
from pydub import AudioSegment
import time

def stream_audio(lag):
    audio_file = 'test.mp3'  # Your audio file path
    audio = AudioSegment.from_mp3(audio_file)
    chunk_length = 1000
    chunks = []
    while len(audio) > chunk_length:
        chunks.append(audio[:chunk_length])
        audio = audio[chunk_length:]
    if len(audio):  # Ensure we don't end up with an empty chunk
        chunks.append(audio)

    def iter_chunks():  
        for chunk in chunks:
            file_like_object = chunk.export(format="mp3")
            data = file_like_object.read()
            time.sleep(lag)
            yield data

    return iter_chunks(), "fixed response"

demo = gr.Interface(
    stream_audio,
    gr.Slider(0, 3, 0, label="lag", info="Duration before generating next second of audio. >1s to cause lag."),
    [gr.Audio(autoplay=True), gr.Textbox()]
)

if __name__ == "__main__":
    _, url, _ = demo.launch()

and then:

from gradio_client import Client

client = Client(url)
result = client.predict(
    0,  # int | float (numeric value between 0 and 3) in 'lag' Slider component
    api_name="/predict"
)
print(result)

abidlabs · 2023-08-03T17:21:55Z

Also, the user-facing API, which requires returning the generator, is a little different from how Gradio users are used to generating/streaming. I would have expected something like this as the API (the function itself being the generator, plus setting streaming=True in the Audio component):

import gradio as gr
import numpy as np
from pydub import AudioSegment
import time

def stream_audio(lag):
    ...
    for chunk in chunks:
        file_like_object = chunk.export(format="mp3")
        data = file_like_object.read()
        time.sleep(lag)
        yield data, "fixed response"


demo = gr.Interface(
    stream_audio,
    gr.Slider(0, 3, 0, label="lag", info="Duration before generating next second of audio. >1s to cause lag."),
    [gr.Audio(autoplay=True, streaming=True), gr.Textbox()]
)

if __name__ == "__main__":
    _, url, _ = demo.launch()

abidlabs · 2023-08-03T17:31:32Z

Ok so here's an idea that doesn't fix everything above but I think would allow you to use the above developer API.

Steps:

Developer writes a regular generator function, something like this:

def stream_audio(lag):
    ...
    for chunk in chunks:
        file_like_object = chunk.export(format="mp3")
        data = file_like_object.read()
        time.sleep(lag)
        yield data, "fixed response"

and sets streaming=True in the Audio() output component like this:

demo = gr.Interface(
    stream_audio,
    gr.Slider(0, 3, 0, label="lag", info="Duration before generating next second of audio. >1s to cause lag."),
    [gr.Audio(autoplay=True, streaming=True), gr.Textbox()]
)

Gradio checks whether any of the outputs have set streaming=True and, if so, doesn't evaluate the generator function but instead passes it into a FastAPI StreamingResponse.
In the Client, we can check whether an endpoint has any outputs that stream and, if so, mark it as an invalid endpoint so that it doesn't show up in the view API page.

aliabid94 · 2023-08-03T17:42:43Z

> Gradio sees if any of the outputs have set streaming=True and if so, doesn't evaluate the generator function, but instead pass it into a FastAPI StreamingResponse

StreamingResponse requires a generator that only yields bytes. We could "wrap" the generator with another generator that tosses out all other outputs. However this will obviously ignore the intended user behaviour of setting the other outputs. There's no way we would be able to get access to the other outputs because we don't have access to the outputs as they are being yielded - only FastAPI does, so we can't send updates or anything with those outputs.
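To make the trade-off concrete, here is a minimal sketch of such a wrapper. The names `audio_bytes_only` and the stand-in `stream_audio` are hypothetical and not part of this PR; the point is only that anything other than the bytes is discarded before it could reach any other output component.

```python
from typing import Iterable, Iterator, Tuple


def audio_bytes_only(outputs: Iterable[Tuple[bytes, str]]) -> Iterator[bytes]:
    """Keep only the audio bytes from each yielded tuple; the text output is dropped."""
    for audio_chunk, _text in outputs:
        yield audio_chunk


def stream_audio():
    """Hypothetical user function yielding (audio bytes, text) pairs."""
    yield b"first", "fixed response"
    yield b"second", "fixed response"


# Only the bytes survive; "fixed response" never reaches the Textbox.
wrapped = list(audio_bytes_only(stream_audio()))
```

A generator like `audio_bytes_only(...)` yields only bytes, which is the shape StreamingResponse needs, but the second element of every tuple is lost along the way.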

abidlabs · 2023-08-03T20:41:00Z

Suppose you wanted to create a demo that streamed music and also generated lyrics in real time for the streaming music. That would not be possible with this API, correct?

abidlabs · 2023-08-03T20:55:41Z

I think we need to do something like this:

When a user passes in a regular generator function and one of the output components has streaming=True, a pending_stream is created for that component. Think of the pending_stream as just a regular list.
On every iteration of the generator function, we take the output corresponding to that component and append it to pending_stream. Once the generator function completes or errors out, we append a special StopIteration token to pending_stream.

In the /stream route, we define a generator that looks like this:

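import time

def stream_until_complete():
    chunks = pending_stream
    index = 0
    while True:
        if index >= len(chunks):
            time.sleep(0.05)  # next chunk not appended yet; wait and re-check
            continue
        chunk = chunks[index]
        index += 1
        if chunk is StopIteration:  # token appended when the generator completes or errors
            return
        yield chunk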

(code may need to be tweaked but this is the general idea)

Then you pass in stream_until_complete into FastAPI's StreamingResponse

The basic idea is that instead of directly passing in our generator function to StreamingResponse (which would mean we lose the other outputs as you said), here we use our generator function to populate a list (potentially even multiple lists if there are multiple streaming output components), and have a second generator that reads from that list which is passed into StreamingResponse.

The benefits of this approach I believe would be to (1) allow developers to maintain an API they are familiar with (2) allow for use cases where you have multiple outputs streaming together
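A rough end-to-end sketch of this idea, using only the standard library, might look like the following. The helper `fill_pending_stream`, the `poll_interval` parameter, and the stand-in `fake_audio` function are assumptions made for illustration and are not taken from the PR; in a real app the second generator would be handed to StreamingResponse rather than collected into a list.

```python
import threading
import time

STOP = StopIteration  # token marking the end of the stream, as proposed above


def fill_pending_stream(generator, pending_stream, other_outputs):
    """Run the user's generator, splitting each yield into audio bytes and the rest."""
    try:
        for audio_chunk, text in generator:
            pending_stream.append(audio_chunk)  # read by the streaming generator
            other_outputs.append(text)          # would go through the regular update path
    finally:
        pending_stream.append(STOP)  # always terminate, even if the generator errors


def stream_until_complete(pending_stream, poll_interval=0.01):
    """Yield chunks in order, waiting while the producer is still running."""
    index = 0
    while True:
        if index >= len(pending_stream):
            time.sleep(poll_interval)
            continue
        chunk = pending_stream[index]
        index += 1
        if chunk is STOP:
            return
        yield chunk


def fake_audio():
    """Hypothetical user function: yields (audio bytes, text) pairs."""
    for i in range(3):
        time.sleep(0.01)
        yield f"chunk{i}".encode(), f"lyric {i}"


pending_stream, other_outputs = [], []
producer = threading.Thread(
    target=fill_pending_stream, args=(fake_audio(), pending_stream, other_outputs)
)
producer.start()
streamed = list(stream_until_complete(pending_stream))
producer.join()
```

Because the producer keeps the non-audio outputs in a separate list, nothing is discarded: the audio bytes and the text both remain available, which is what the multi-output use case needs.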

aliabid94 · 2023-08-08T00:05:19Z

Ok now I accept direct yielding from the function, see demo/stream_audio_out/ for an example. Ready for re-review @abidlabs

demo/stream_audio_out/run.py

abidlabs · 2023-08-08T16:57:12Z

guides/02_building-interfaces/02_reactive-interfaces.md

+$code_stream_frames
+
+Streaming can also be done in an output component. A `gr.Audio(streaming=True)` output component can take a stream of audio data yielded piece-wise by a generator function and combines them into a single audio file.


Let's put the stream_audio_out example demo here (ideally after simplifying it a bit)

.changeset/famous-rice-taste.md

abidlabs · 2023-08-08T18:59:50Z

Here's a simplified demo you can use @aliabid94:

import gradio as gr
from pydub import AudioSegment
import time


def stream_audio(audio_file, lag):
    audio = AudioSegment.from_mp3(audio_file)
    i = 0
    chunk_size = 1000  # chunk length in milliseconds

    while chunk_size*i < len(audio):
        chunk = audio[chunk_size*i:chunk_size*(i+1)]
        i += 1
        if chunk:
            file = f"/tmp/{i}.mp3"
            chunk.export(file, format="mp3")
            time.sleep(lag)  # delay before the next chunk; `lag` was otherwise unused
            yield file, i
        
demo = gr.Interface(
    fn=stream_audio,
    inputs=[
        gr.Audio(type="filepath", label="Audio file to stream"),
        gr.Slider(0, 3, 0,
            label="lag",
            info="Duration before generating next second of audio. Set >1s to cause lag.",
        ),
    ],
    outputs=[
        gr.Audio(
            autoplay=True, 
            streaming=True), # needed to stream output audio
        gr.Textbox()
    ],
)

if __name__ == "__main__":
    demo.queue().launch()

abidlabs · 2023-08-08T19:11:54Z

Noticing some small issues:

Trying to stream two output audio files at the same time doesn't work (it raises a mysterious KeyError).

Here's an adaptation of the code above:

import gradio as gr
from pydub import AudioSegment
import time


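def stream_audio(audio_file, lag):
    audio = AudioSegment.from_mp3(audio_file)
    i = 0
    chunk_size = 1000  # chunk length in milliseconds

    while chunk_size*i < len(audio):
        chunk = audio[chunk_size*i:chunk_size*(i+1)]
        i += 1
        if chunk:
            file = f"/tmp/{i}.mp3"
            chunk.export(file, format="mp3")
            time.sleep(lag)  # delay before the next chunk; `lag` was otherwise unused
            yield file, file  # same chunk sent to both streaming Audio outputs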
        
demo = gr.Interface(
    fn=stream_audio,
    inputs=[
        gr.Audio(type="filepath", label="Audio file to stream"),
        gr.Slider(0, 3, 0,
            label="lag",
            info="Duration before generating next second of audio. Set >1s to cause lag.",
        ),
    ],
    outputs=[
        gr.Audio(
            autoplay=True, 
            streaming=True), # needed to stream output audio
        gr.Audio(
            autoplay=True, 
            streaming=True), # needed to stream output audio
    ],
)

if __name__ == "__main__":
    demo.queue().launch()

There's a very brief but slightly jarring discontinuity between the chunks when livestreaming the output audio. It's not clear to me whether it's an issue with the chunking logic or with the streaming logic. This is the audio file I tried: https://dl.sndup.net/tckv/test.mp3

aliabid94 · 2023-08-08T21:18:51Z

> When trying to stream two output audio files at the same time, this doesn't work (raises a mysterious KeyError):

Fixed.

> There's a very brief but slightly jarring discontinuity in between the chunks when livestreaming the output audio

I think it's because we were streaming 1 second chunks, which was too frequent. Increased to 3 second chunks in the demo and the breaks are much better imo.
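For a sense of scale, longer chunks mean fewer chunk boundaries over the same audio, and so fewer places where a break can be heard. The sketch below assumes a hypothetical helper `chunk_bounds` and a 10-second file; neither comes from the PR, and it does not establish what actually causes the discontinuity.

```python
def chunk_bounds(total_ms, chunk_ms):
    """Return (start, end) millisecond offsets; the last chunk may be shorter."""
    return [
        (start, min(start + chunk_ms, total_ms))
        for start in range(0, total_ms, chunk_ms)
    ]


one_second = chunk_bounds(10_000, 1000)    # 10 chunks, 9 boundaries
three_second = chunk_bounds(10_000, 3000)  # 4 chunks, 3 boundaries

# Number of seams between consecutive chunks for each chunk size.
seams_1s = len(one_second) - 1
seams_3s = len(three_second) - 1
```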

abidlabs · 2023-08-08T21:49:36Z

> I think it's because we were streaming 1 second chunks, which was too frequent. Increased to 3 second chunks in the demo and the breaks are much better imo.

I think I still hear them with the 3-second chunks, but it's very minor, so not a blocker imo.

abidlabs

Awesome PR @aliabid94!

gradio-pr-bot · 2023-08-08T22:08:52Z

🎉 Chromatic build completed!

There are 0 visual changes to review.
There are 0 failed tests to fix.

changes

3461d54

aliabid94 requested a review from freddyaboulton August 3, 2023 06:46

add changeset

437e9a9

vercel bot deployed to Preview August 3, 2023 06:47 View deployment

aliabid94 added 2 commits August 3, 2023 01:48

changes

bffa327

Merge branch 'stream_audio' of https://github.com/gradio-app/gradio i…

6856edd

…nto stream_audio

aliabid94 requested a review from abidlabs August 3, 2023 06:49

vercel bot deployed to Preview August 3, 2023 06:49 View deployment

sanchit-gandhi mentioned this pull request Aug 4, 2023

[MusicGen] Add streamer to generate huggingface/transformers#25320

Merged

changes

4967eba

vercel bot deployed to Preview August 8, 2023 00:05 View deployment

changes

5830ece

vercel bot deployed to Preview August 8, 2023 00:12 View deployment

changes

725ffaa

vercel bot deployed to Preview August 8, 2023 00:18 View deployment

changes

7e428a5

vercel bot deployed to Preview August 8, 2023 00:21 View deployment

changes

c7ddd3e

vercel bot deployed to Preview August 8, 2023 01:15 View deployment

vercel bot deployed to Preview August 8, 2023 01:32 View deployment

aliabid94 added 2 commits August 7, 2023 19:40

changes

0fbe721

Merge branch 'stream_audio' of https://github.com/gradio-app/gradio i…

af5ac91

…nto stream_audio

vercel bot deployed to Preview August 8, 2023 02:44 View deployment

abidlabs reviewed Aug 8, 2023

demo/stream_audio_out/run.py (outdated)

abidlabs reviewed Aug 8, 2023

.changeset/famous-rice-taste.md

changes

9dcaec6

vercel bot deployed to Preview August 8, 2023 20:59 View deployment

changes

11f48b6

vercel bot deployed to Preview August 8, 2023 21:14 View deployment

changes

de3cff9

vercel bot deployed to Preview August 8, 2023 21:18 View deployment

abidlabs approved these changes Aug 8, 2023

changes

0ee623f

vercel bot deployed to Preview August 8, 2023 21:53 View deployment

changes

6ebdd64

vercel bot deployed to Preview August 8, 2023 21:56 View deployment

aliabid94 merged commit 667875b into main Aug 8, 2023
13 checks passed

aliabid94 deleted the stream_audio branch August 8, 2023 22:08

pngwn mentioned this pull request Aug 9, 2023

chore: update versions #5038

Merged

snarb mentioned this pull request Aug 10, 2023

Audio streaming still not working properly with latest fixes #5168

Closed

1 task

This was referenced Aug 11, 2023

Feature request: stream images from video in realtime #5187

Open

Streaming video update #1637

Closed

Enable streaming audio in python client #5248

Merged
