Add Support for large audio files ( > 2GB) #135

Open
jiaaro opened this Issue Apr 20, 2016 · 6 comments

5 participants
@jiaaro
Owner

jiaaro commented Apr 20, 2016

Until this is built, some users will be able to get by with breaking the audio into chunks, as in my comment on #124.
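A minimal sketch of the chunking idea, using only the standard-library `wave` module rather than pydub itself. It assumes the audio has already been converted to an uncompressed WAV file; the helper name `iter_wav_chunks` and the default chunk length are illustrative, not taken from #124.

```python
import wave


def iter_wav_chunks(path, chunk_ms=10_000):
    """Yield (params, raw_frames) pairs, one chunk_ms-long piece at a time.

    Only one chunk is held in memory at once, so peak memory depends on
    chunk_ms rather than on the length of the file.
    """
    with wave.open(path, "rb") as wav:
        params = wav.getparams()
        frames_per_chunk = max(1, params.framerate * chunk_ms // 1000)
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:
                break
            yield params, frames
```

Each chunk could then be processed and written out on its own, for example by wrapping the raw frames in a regular in-memory segment.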

Try implementing a StreamingAudioSegment class: an API-compatible implementation of AudioSegment (which may use AudioSegment internally?) that provides the same interface and methods, but does not load the complete audio into RAM.
If there are significant roadblocks there, perhaps fall back to standalone utilities that each perform one memory-intensive operation (not as nice a solution).

Outline of an approach:

The new VeryMemoryConsciousAudioSegment (still workshopping names) could do the audio conversions up front (as they are done now), writing standard wave data to temp files on disk. Operations on these instances would just pile up in a list until the moment the actual audio data is needed (for an export, or for retrieving info like duration or loudness).

When the audio data is needed, all pending operations would be applied and the result stored in a new temp file on disk in order to avoid reapplying the operations over and over.
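The outline above could be sketched roughly as follows. This is not pydub API: the class name `LazyWavSegment`, its methods, and the `mute` operation are all hypothetical, and the sketch assumes the audio is already a WAV file on disk and that every operation can work on one block of frames at a time.

```python
import os
import shutil
import tempfile
import wave


class LazyWavSegment:
    """Hypothetical sketch: queue operations, apply them only on demand."""

    def __init__(self, path, pending=()):
        self._path = path               # WAV file on disk with the current audio
        self._pending = list(pending)   # queued operations, oldest first

    def apply(self, op):
        """Return a new segment with op queued; no audio is read yet.

        op is any callable mapping a block of raw frame bytes to bytes.
        """
        return LazyWavSegment(self._path, self._pending + [op])

    def _materialize(self, chunk_frames=65_536):
        """Run all queued operations once, chunk by chunk, into a temp file."""
        if not self._pending:
            return self._path
        fd, out_path = tempfile.mkstemp(suffix=".wav")
        os.close(fd)
        with wave.open(self._path, "rb") as src, wave.open(out_path, "wb") as dst:
            dst.setparams(src.getparams())
            while True:
                frames = src.readframes(chunk_frames)
                if not frames:
                    break
                for op in self._pending:
                    frames = op(frames)
                dst.writeframesraw(frames)
        # Cache the result so the operations are not reapplied next time.
        # (A real implementation would also need to delete its temp files.)
        self._path, self._pending = out_path, []
        return out_path

    @property
    def duration_seconds(self):
        with wave.open(self._materialize(), "rb") as wav:
            return wav.getnframes() / wav.getframerate()

    def export(self, out_path):
        shutil.copyfile(self._materialize(), out_path)


def mute(frames):
    """Example operation: silence a block of 16-bit PCM frames."""
    return bytes(len(frames))
```

With this shape, `LazyWavSegment("in.wav").apply(mute)` touches no audio until `duration_seconds` or `export()` is called. Operations that need to see the whole signal at once, such as measuring loudness, do not fit the per-chunk model without extra work.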

As I think more about this, the approach has some downsides: it is much more disk intensive, and operations that inspect the audio data (like getting loudness) become harder. I'm becoming more convinced that the current in-memory AudioSegment will need to stick around for some uses even if we get to a fully feature-complete streaming/on-disk implementation.


note: I was originally going to commandeer #124, then #51, and finally settled on adding a new ticket.

Also related: #101

@nickmetal

nickmetal commented May 1, 2017

Hi @jiaaro! Is there any news about StreamingAudioSegment? :)
I think it would be very much in demand.

@jiaaro

Owner

jiaaro commented May 2, 2017

@nickmetal so far no news. It appears to be a relatively big project, and as the transition to 64-bit continues and machines have more RAM, it seems the need for it is slowly going away. For example, I recently loaded an audio segment with ~8 hours of "CD quality" audio. It used a lot of RAM, sure, but it worked.

Can you comment further on what you would use it for?
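As a rough check on the figures in this thread, the raw size of uncompressed audio can be worked out directly. The sketch below assumes "CD quality" means 44.1 kHz, 16-bit, stereo PCM, and it counts only the raw sample data, not whatever extra memory pydub uses while loading.

```python
SAMPLE_RATE_HZ = 44_100   # assumed "CD quality" sample rate
CHANNELS = 2              # stereo
BYTES_PER_SAMPLE = 2      # 16-bit samples

BYTES_PER_SECOND = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE


def pcm_bytes(seconds):
    """Raw PCM size in bytes for the given duration."""
    return seconds * BYTES_PER_SECOND


def seconds_to_fill(limit_bytes):
    """Duration of audio whose raw PCM data reaches limit_bytes."""
    return limit_bytes / BYTES_PER_SECOND


print(pcm_bytes(8 * 3600))             # 5080320000 bytes for 8 hours
print(seconds_to_fill(2**31) / 3600)   # hours of audio that fit in 2 GiB
```

So 8 hours comes to about 5.08 GB (roughly 4.73 GiB) of sample data, and a 2 GiB limit is reached after a little under 3 hours 23 minutes.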

@nickmetal

nickmetal commented May 3, 2017

@jiaaro I want to use this for converting audio from webm to mp3 for my own needs, but my server is limited to 512 MB of RAM.

I may have a solution, but it needs much more testing than it has had so far. It currently converts webm to mp3 using no more than ~25 MB of RAM. I'm going to test it this weekend, and I will open a pull request if the results are positive.
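The comment does not say how the ~25 MB conversion works. One possible way to keep memory low, shown here purely as an assumption and not as nickmetal's method, is to hand the whole conversion to the ffmpeg command-line tool, which decodes and encodes as a stream, instead of loading the audio into an AudioSegment. The function names are illustrative, and the sketch assumes an ffmpeg build with libmp3lame is on the PATH.

```python
import subprocess


def build_ffmpeg_command(src, dst, quality=2):
    """Build an ffmpeg command that transcodes the audio in src to MP3."""
    return [
        "ffmpeg",
        "-y",                       # overwrite dst if it already exists
        "-i", src,                  # input file, e.g. a .webm
        "-vn",                      # ignore any video stream
        "-codec:a", "libmp3lame",   # encode audio with LAME
        "-q:a", str(quality),       # VBR quality, lower is better
        dst,
    ]


def convert_to_mp3(src, dst, quality=2):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst, quality), check=True)
```

The trade-off is that nothing can be done to the audio in Python along the way; any editing would have to be expressed as ffmpeg options.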

@mission

mission commented Jul 31, 2017

As for the memory error, switching to a 64-bit version of Python fixed it, since a 32-bit process is limited to a 4 GB address space.

Hopefully this helps :)

@kamisori

kamisori commented Nov 14, 2017

@jiaaro try longer audio files. In my case pydub returned sweet nothings after 9 hours and 26 minutes.
At one point RAM usage peaked at 15.9 GB, but then it went down steadily while splitting the file.
Perhaps the whole file was never fully loaded into memory.

gpchelkin added a commit to gpchelkin/scdlbot that referenced this issue Nov 18, 2017

@Kazanz

Kazanz commented May 3, 2018

Any news on this issue?
