
Use compressed audio directly from memory #35

Open
jmercouris opened this issue Sep 27, 2016 · 9 comments

Comments

@jmercouris

It is currently possible to decode audio files to PCM. It would also be nice to be able to decode MP3 data frames directly to PCM.

sampsyo changed the title from "Allow for audio read to directly decode Data frames" to "Use compressed audio directly from memory" on Sep 27, 2016
@sampsyo
Member

sampsyo commented Sep 27, 2016

This is related to #34, where the desire is to load audio data from the network. This would be a good idea, if we can retrofit a direct-from-memory interface onto all of our backends.

There's one complicating question: How do we know which backend to select? When reading from disk, it's easy: we just try reading the file with each, and if one backend fails (e.g., because the format is unsupported), we give up and "rewind" by reading the file from the beginning using a different backend. That will be harder to do in a streaming setting, where we don't have the luxury to travel back in time to read again from the beginning.
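For illustration, the rewind-and-retry strategy for on-disk files amounts to roughly this sketch (the DecodeError class and the backend callables are placeholders, not audioread's exact internals):

class DecodeError(Exception):
    pass

def open_from_disk(path, backends):
    # Try each backend in turn; a failed attempt is cheap because the next
    # backend simply re-opens the file and starts reading from the beginning.
    for backend in backends:
        try:
            return backend(path)
        except DecodeError:
            continue
    raise DecodeError("no backend could decode %s" % path)

With a stream, the bytes consumed by a failed attempt are gone unless they are buffered somewhere, which is what makes the same fallback strategy awkward.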

I also suspect some backends will want a MIME type for streamed data—i.e., they currently use the filename extension as a signal for the data type.

@jmercouris
Author

jmercouris commented Sep 28, 2016

As a solution for any backends that require a MIME type or file extension, we could temporarily "make" a file with the expected format. We would then keep appending to that file and have the backend decode it into a buffer, which would then be returned to the user.
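A rough sketch of the file-with-extension part of that idea (the ".mp3" suffix and the chunk source are assumptions for illustration):

import tempfile

def stream_to_temp_file(chunks, suffix=".mp3"):
    # Write incoming compressed frames into a named file so a filename-based
    # backend can infer the format from the extension.
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    for chunk in chunks:
        tmp.write(chunk)
        tmp.flush()  # make the new bytes visible to the decoder
    tmp.close()
    return tmp.name  # hand this path to the backend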

It would additionally be helpful to be able to supply an argument giving the expected format.

I've actually implemented a piece of code that works exactly on this principle. Right now it has some small playback issues due to the GIL, but if this is a path you'd be interested in, I can share it.

@jksinton
Collaborator

I'm also interested in feeding audioread an MP3 read from memory instead of a filename. I have a darkice audio stream that I would like to follow, open the stream in chunks, and analyze using librosa.

Initially, I was thinking of configuring darkice to continuously write to an MP3 and following it with seek. But now, you have me thinking of reading the stream over the network.

As a test, I've successfully passed the MP3 binary to the stdin of an ffmpeg subprocess:

import subprocess

a = open("archive_active.mp3", "rb")  # open the MP3 in binary mode
p = subprocess.Popen(['ffmpeg', '-i', '-', 'out.wav'],
                     stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.STDOUT)
print(p.communicate(input=a.read())[0].decode())

But I'm not sure where to begin on editing audioread.

@sampsyo
Member

sampsyo commented Oct 21, 2016

Cool! If you're interested, maybe the right place to start would be with the ffdec backend. You could try hacking it in at first in the most non-elegant way possible—just port what you've done there to ffdec.py—and then we can sort out how to make the interface configurable.

@jksinton
Collaborator

Okay, I forked audioread and made my own branch: https://github.com/jksinton/audioread/tree/compressed-audio

It's not pretty, but ffdec now accepts compressed audio as an argument. One of the challenges is that Popen.communicate returns a tuple of str objects for both stdout and stderr instead of file objects, while the QueueReaderThread and _get_info functions rely on self.proc.stdout/stderr being file objects. To solve this, I convert the str objects to file objects with StringIO. This can't be the most efficient solution.
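Roughly, the wrapping looks like this (Python 2-era sketch; the variable names are illustrative, and io.BytesIO would be the Python 3 equivalent):

import subprocess
from StringIO import StringIO

with open("archive_active.mp3", "rb") as f:
    compressed_audio = f.read()

proc = subprocess.Popen(['ffmpeg', '-i', '-', '-f', 'wav', '-'],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

# communicate() hands back plain strings, but QueueReaderThread and _get_info
# expect file-like objects, so the strings get wrapped before being passed on.
stdout_str, stderr_str = proc.communicate(input=compressed_audio)
stdout_file = StringIO(stdout_str)
stderr_file = StringIO(stderr_str)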

@sampsyo
Member

sampsyo commented Oct 23, 2016

Cool; looks good! Maybe we should pull this into a branch in the central repository so everybody can work on it together.

I don't quite see why the new version needs to use Popen.communicate. Does the old approach, which only reads one block at a time, not work? It would be great to preserve that "incremental" property; maybe we need a second thread to send data into the pipe as it becomes ready.

@jksinton
Collaborator

Adrian,

Sure, I'll submit a pull request once we create a branch for this feature. I don't think I have the privileges to create a branch on the central repository.

I was using Popen.communicate because the Python subprocess documentation gave this warning:

Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

See the warning right above Popen.stdin: https://docs.python.org/2/library/subprocess.html#subprocess.Popen.stdin

I might be able to prepare a QueueWriterThread function to write directly to Popen.stdin, similar to the QueueReaderThread.

-James

@jksinton
Collaborator

I've pushed a new version to the compressed-audio branch on my fork. It replaces Popen.communicate with a WriterThread and works with the QueueReaderThread you already had implemented. In other words, the stdout produced by writing to proc.stdin is handled by the existing QueueReaderThread, without Popen.communicate.

I tried writing directly with self.proc.stdin.write(audio.read()), without a threaded write (i.e., WriterThread), but the process would hang. Not sure why, when the WriterThread is essentially just this:

def run(self):
    self.fh.write(self.audio.read())
    self.fh.close()

@sampsyo
Member

sampsyo commented Oct 26, 2016

That sounds great. That "hanging" behavior, in fact, is exactly why we need a separate thread for reading (and now for writing): the OS call to read or write from a file descriptor can block. We need to concurrently block in the OS call while letting the rest of the application proceed. And, with the new extension, we will need to concurrently send data into the subprocess while reading data out of the same subprocess: so, two threads.
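For concreteness, here is a minimal sketch of that two-thread arrangement around an ffmpeg pipe (Python 3; the function and variable names are illustrative, not audioread's actual classes):

import subprocess
import threading

def feed(stdin, data):
    # Block in the write until ffmpeg drains its stdin, then signal EOF.
    stdin.write(data)
    stdin.close()

def drain(stdout, chunks, blocksize=4096):
    # Block in reads until ffmpeg closes its stdout.
    while True:
        block = stdout.read(blocksize)
        if not block:
            break
        chunks.append(block)

with open("archive_active.mp3", "rb") as f:
    data = f.read()

proc = subprocess.Popen(['ffmpeg', '-i', '-', '-f', 'wav', '-'],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        stderr=subprocess.DEVNULL)

chunks = []
writer = threading.Thread(target=feed, args=(proc.stdin, data))
reader = threading.Thread(target=drain, args=(proc.stdout, chunks))
writer.start()
reader.start()
writer.join()
reader.join()
proc.wait()

pcm = b"".join(chunks)  # decoded WAV/PCM bytes

Because the writer and reader each block in their own thread, neither pipe buffer filling up can stall the whole program.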

I will make you a collaborator on this repository—that should let you create a branch here whenever you're ready.
