torchaudio.load does not load webm bytesio or _io.BufferedReader, but loads the file from hard drive #2792

pooya-mohammadi · 2022-10-24T19:06:25Z

🐛 Describe the bug

I have a websocket that receives chunks of data as bytes. The browser encodes the data in audio/webm format. The code looks like this:

@app.websocket("/listen")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            data = await websocket.receive_bytes()
            with open('audio.wav', mode="wb") as f:
                f.write(data)
    except Exception as e:
        raise Exception(f'Could not process audio: {e}')
    finally:
        await websocket.close()

Manually writing the data to audio.wav and then reading the file using the following code works fine with no errors:

array, sr = torchaudio.load("audio.wav")

However, reading the file as a file object does not work:

with open("audio.wav", mode="rb") as f:
    torchaudio.load(f)

It raises the following error:

Exception: Could not process audio: Failed to open the input "<_io.BufferedReader name='audio.wav'>" (Invalid data found when processing input).
INFO:     connection closed

PS: Creating a BytesIO from the data and passing it to torchaudio.load results in the same error as above.

Versions

Versions of relevant libraries:
[pip3] numpy==1.23.4
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[conda] numpy 1.23.4 pypi_0 pypi

OS

Ubuntu: 22.04
torchaudio.backend: "sox_io"

PS

I tested the same process on a WebM file that was converted from a WAV file, and the result was the same:

torchaudio.load can read the file from the hard drive.
torchaudio.load cannot read a BytesIO or an _io.BufferedReader.


mthrok · 2022-10-25T00:36:17Z

Hi @pooya-mohammadi

This issue seems to depend on the data you are handling. Could you share some sample data that contains no PII and has no copyright issues? Complete silence is fine.

pooya-mohammadi · 2022-10-25T05:41:53Z

Hi @mthrok
audio.zip
It's a simple file (less than 1 second) that contains a little noise to make sure the mic was working correctly.

mthrok · 2022-10-25T07:41:26Z

Hi @pooya-mohammadi

The audio you shared has a .wav extension but is, in fact, in WebM format.

with open("audio.wav", "rb") as f:
    print(f.read(50)[30:])

prints the following

b'\x84webmB\x87\x81\x02B\x85\x81\x02\x18S\x80g\x01\xff\xff'

and ffprobe audio.wav reports:

Input #0, matroska,webm, from 'audio.wav':
  Metadata:
    encoder         : QTmuxingAppLibWebM-0.0.1
  Duration: N/A, start: -0.001000, bitrate: N/A
  Stream #0:0(eng): Audio: opus, 48000 Hz, mono, fltp (default)

torchaudio.load first attempts to read the input with libsox, which fails because libsox does not support WebM. It retries with FFmpeg only when the source is a file path. It cannot retry when the input is a file-like object, because retrying would require rewinding the object and a seek method is not always available.
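Since the file extension says nothing about the actual container, it can help to check the leading bytes before choosing a loader. The following is only a rough sketch: `sniff_container` and `probe_size` are hypothetical names, not part of torchaudio, and the check is a heuristic that relies on the RIFF/WAVE signature and the EBML magic number that starts Matroska/WebM files. It assumes the file object is seekable.

```python
import io

EBML_MAGIC = b"\x1a\x45\xdf\xa3"  # first four bytes of a Matroska/WebM file


def sniff_container(fileobj, probe_size=64):
    """Guess the container from the leading bytes, ignoring the file name.

    Hypothetical helper. The read position is restored afterwards,
    so fileobj must support tell() and seek().
    """
    start = fileobj.tell()
    head = fileobj.read(probe_size)
    fileobj.seek(start)
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wav"
    if head[:4] == EBML_MAGIC:
        # The EBML header carries a DocType string such as "webm" or "matroska".
        return "webm" if b"webm" in head else "matroska"
    return "unknown"


# A real WAV header starts with RIFF....WAVE
print(sniff_container(io.BytesIO(b"RIFF\x24\x00\x00\x00WAVEfmt ")))
```

If the result is not "wav", passing the object to torchaudio.io.StreamReader instead of torchaudio.load is the route suggested in this thread.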

To handle WebM, you can use torchaudio.io.StreamReader. It works with both file paths and file-like objects, and it supports iterative reading as well.

# load from a path and read the entire audio in one go
s = torchaudio.io.StreamReader(path)
s.add_basic_audio_stream(-1)
s.process_all_packets()
waveform, = s.pop_chunks()

# load from a file-like object and read the audio chunk by chunk
s = torchaudio.io.StreamReader(f)
s.add_basic_audio_stream(chunk_size)
for chunk, in s.stream():
    ...  # process chunk

For the detailed usage, please checkout tutorials like

pooya-mohammadi · 2022-10-25T10:00:27Z

@mthrok
Thanks for the detailed answer. It works fine. I solved the issue with the code snippet below. However, I am creating a stream over data that is already generated by another streaming tool, and I don't think this is the most efficient way. Are you planning to add WebM support to torchaudio.load, or something with similar functionality that reads a whole WebM file object, in the future?

@app.websocket("/listen")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            data = await websocket.receive_bytes()
            f = BytesIO(data)
            s = torchaudio.io.StreamReader(f)
            s.add_basic_audio_stream(1000)
            tensor = torch.concat([chunk[0] for chunk in s.stream()])
            print(tensor.shape)
    except Exception as e:
        raise Exception(f'Could not process audio: {e}')
    finally:
        await websocket.close()

mthrok · 2022-10-25T12:37:55Z

I think you could do chunk-by-chunk decoding, which is more efficient, but not sure if this is what you want, as I do not know what application you are building.

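To do chunk-by-chunk decoding, you can wrap the socket object in a synchronous file-like object. A regular read method cannot await, so the wrapper hands the coroutine to the event loop and blocks until it finishes. This means StreamReader has to run in a worker thread, not on the event loop thread.

import asyncio

class Wrapper:
    def __init__(self, socket):
        self.socket = socket
        # create the wrapper inside the endpoint coroutine so the loop is running
        self.loop = asyncio.get_running_loop()
        self.buffer = b''

    def read(self, n):
        # called by StreamReader from a worker thread
        while len(self.buffer) < n:
            future = asyncio.run_coroutine_threadsafe(
                self.socket.receive_bytes(), self.loop)
            new_data = future.result()
            if not new_data:
                break
            self.buffer += new_data
        data, self.buffer = self.buffer[:n], self.buffer[n:]
        return data

Then pass it to StreamReader and let StreamReader pull the data.

def decode(wrapper):
    s = torchaudio.io.StreamReader(wrapper)
    s.add_basic_audio_stream(1000)
    for chunk, in s.stream():
        print(chunk.shape)

try:
    wrapper = Wrapper(websocket)
    await asyncio.to_thread(decode, wrapper)
except ...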
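The wrapper idea can be tried without a browser or torchaudio. The sketch below is only an illustration under stated assumptions: `FakeSocket`, `BlockingReader`, and `consume` are hypothetical stand-ins (for the websocket, the wrapper, and StreamReader respectively), and the fake socket signals the end of the stream with empty bytes, which a real websocket may not do. The consumer runs in a worker thread via asyncio.to_thread (Python 3.9+), and each read schedules receive_bytes on the event loop with asyncio.run_coroutine_threadsafe.

```python
import asyncio


class FakeSocket:
    """Hypothetical stand-in for a websocket: returns queued messages, then b''."""

    def __init__(self, messages):
        self._messages = list(messages)

    async def receive_bytes(self):
        await asyncio.sleep(0)  # yield to the loop, as real I/O would
        return self._messages.pop(0) if self._messages else b""


class BlockingReader:
    """Blocking read(n) built on top of an async receive_bytes()."""

    def __init__(self, socket, loop):
        self.socket = socket
        self.loop = loop  # the event loop that owns the socket
        self.buffer = b""

    def read(self, n):
        # Must run off the event loop thread, otherwise result() would deadlock.
        while len(self.buffer) < n:
            future = asyncio.run_coroutine_threadsafe(
                self.socket.receive_bytes(), self.loop
            )
            new_data = future.result()
            if not new_data:  # end of stream
                break
            self.buffer += new_data
        data, self.buffer = self.buffer[:n], self.buffer[n:]
        return data


def consume(reader, n):
    """Plays the role of the decoder: pulls n-byte blocks until EOF."""
    blocks = []
    while True:
        block = reader.read(n)
        if not block:
            return blocks
        blocks.append(block)


async def main():
    socket = FakeSocket([b"abc", b"de", b"fghij"])
    reader = BlockingReader(socket, asyncio.get_running_loop())
    # Run the blocking consumer in a worker thread so the loop stays free.
    return await asyncio.to_thread(consume, reader, 4)


print(asyncio.run(main()))  # [b'abcd', b'efgh', b'ij']
```

The message boundaries of the socket (3, 2, and 5 bytes) disappear on the reading side: the consumer sees fixed-size blocks plus a short final read, which is the behavior a file-like consumer expects.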

mthrok · 2022-10-25T12:48:35Z

Note that you can also read everything in one go from a file-like object. The src argument and how decoding is done are independent.

s = torchaudio.io.StreamReader(fileobj)
s.add_basic_audio_stream(-1)
s.process_all_packets()
waveform, = s.pop_chunks()

mthrok · 2022-10-25T12:49:00Z

I am going to close the issue, as this is not a bug with torchaudio.

pooya-mohammadi changed the title ~~torachaudio.load does not load webm bytesio or _io.BufferedReader, but loads the from the file~~ torachaudio.load does not load webm bytesio or _io.BufferedReader, but loads the file from hard drive Oct 25, 2022

mthrok closed this as completed Oct 25, 2022

pooya-mohammadi closed this as completed Oct 25, 2022
