Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove AudioContext.decodeAudioData() from the web #1305

Closed
juj opened this issue Sep 1, 2017 · 10 comments

Comments

@juj
Copy link

@juj juj commented Sep 1, 2017

Reading articles around the web, I find that Web Audio API is in a horrible state of affairs at the moment with respect to the AudioContext.decodeAudioData function, since a large number of articles such as

very casually use decodeAudioData, without any remarks in sight that this function goes against everything that proper best practices on audio would do. The decodeAudioData() function is specced to decode the whole file up front, and virtually no application in existence wants to do this, since it will waste hundreds of megs of memory and take several seconds of CPU power, and battery on mobile. The large number of above type of articles on the web clearly shows that authors do not generally realize the issues with decodeAudioData(), and as a result, the web is in a situation where the "best" practices articles, even from respectable sites like above, teach how to solve audio tasks by consuming away potentially insane amounts of RAM and CPU.

The reason why these articles all so prominently mention decodeAudioData(), is that currently decompressing is required to two most common tasks: a) loop (possibly with custom loop points), and b) animate pitch changes. (issue #938) There is no technical reason why these effects require decoding the input audio files fully up front, but due to an accidental API limitation, that is the only way to get to those features.

The function AudioContext.decodeAudioData() really needs to go. There should exist easy ways to achieve everything that these sites currently use decodeAudioData() for, so that there would be very little reason to use it for common tasks, and web sites need to stop referring to it as "best practices" to do looping or something else.

I'd recommend that decodeAudioData was replaced with a cursor-based API where one would create a decoder object, and seek to start decoding from a desired location, and then tell the decoder to advance N samples at a time. In this process, the API flow and the Web Audio spec documentation could make it better documented that there are CPU and memory usage costs in doing these operations, and it's very far to casually go off decoding all audio one uses in an application.

We had chatted with @padenot face to face in June about this, and issue #938, and the prospect of AudioWorklets+SharedArrayBuffer+WebAssembly to come and save the day for issue #938. For WebAssembly-based applications that will be great, but I presume even after AudioWorklets land, it will not make a dent to change the contents of above websites, which will still keep referring to AudioContext.decodeAudioData() being the simple way to do looping audio (for example), being oblivious that it will potentially blow up memory usage 100x ([1]) compared to the disk size, and take up dozens of seconds on mobile.

Web Audio API graph is stellar at giving great flexibility to doing all sorts of effects and processing on uncompressed audio in a simple manner, I think from users' point of view, operating on compressed audio should be simpler than uncompressed audio, so that it would naturally nudge tutorial sites and users towards best practices of keeping audio compressed in the first place. The carefree nature that sites currently treat decodeAudioData() is sad, that should not be allowed to continue.

@juj

This comment has been minimized.

Copy link
Author

@juj juj commented Sep 1, 2017

At http://clb.demon.fi/audio_test_suite/ (source files at https://github.com/juj/audio_test_suite), there is now a repository of test clips that poke Web Audio API from this angle.

@padenot

This comment has been minimized.

Copy link
Member

@padenot padenot commented Sep 1, 2017

I partially agree with this, that's why having proper and symmetric encoding/decoding API, stream based, with in-memory-compressed audio assets is #1 on my feature list for v.next.

Having the data in memory is reasonable for some use cases, though.

@mr21

This comment has been minimized.

Copy link

@mr21 mr21 commented Sep 2, 2017

Is it impossible to give two optional arguments to ctx.decodeAudioData(), an offset and duration, like the ABSN.start()? So there will be no breaking change and it will be possible to decode a specific part of a file.

@notthetup

This comment has been minimized.

Copy link
Contributor

@notthetup notthetup commented Sep 3, 2017

That's a great idea, but I think for use-cases where you're decoding small file (for sound effects, flairs, etc) it's important to ensure that the new API supports a simple variant to completely decode the file. Having to dabble with cursors and step sizes when all you want is a small (1 second) file decoded would be a step backwards in usability. So I would encourage the group to consider both use cases when designing the API.

@joeberkovitz

This comment has been minimized.

Copy link
Contributor

@joeberkovitz joeberkovitz commented Sep 7, 2017

We agree about the importance of this issue and #335 and #337 already capture our intent to support partial and streamed decoding.

@juj

This comment has been minimized.

Copy link
Author

@juj juj commented Sep 13, 2017

Thanks for cross-referencing the links @joeberkovitz . Though those items are about adding new APIs, and not about removing decodeAudioData(). Even if support for partial and streamed decoding is added, and #938 fixed, the root cause still remains that the web is filled with sites that promote bad practices without the authors even realizing.

It does not seem right that one line in a JavaScript application can consume twice or three times the memory than the rest of the browser combined, while users do not know this is happening. I think it would be a wrong call to ignore what is happening here without taking any action, and I think this bug should not have gotten closed on the above grounds.

My proposal still would be to deprecate decodeAudioData(). This does not mean that decoding a whole audio file should be made "impossibly hard", but that the API to do so should look different. For example, if there were a cursor, or an offset based API, one could initialize the cursor to the beginning, and ask to decode all the way to the end in one go.

In particular, decoding should support avoiding sample rate and sample type conversion. The spec could associate this with a note that shows how one computes the output buffer size based on decoder input parameters, and make an exclamation mark about the vast amount of memory that this can easily consume. Such a note could include a mention about decoding an audio file is not a playback time activity, but an audio production time activity, so that developers understand the scope of where such functions are used. This of course hinges on the spec solving #938 first, so that users do not need to look towards decodeAudioData() to solve their basic playback tasks.

@magcius

This comment has been minimized.

Copy link

@magcius magcius commented Sep 14, 2017

I agree with @juj on this. Users likely don't want megabytes of PCM audio data. One option here would be to add support for a frame-based API where the number of samples you get back is possibly variable, and we declare sample-accurate seek support impossible. This is similar to the API of libmad2 where it hands you frames, and it matches emscripten-powered applications do already.

@joeberkovitz

This comment has been minimized.

Copy link
Contributor

@joeberkovitz joeberkovitz commented Sep 15, 2017

When we take up a better decoding API in the next round, we'll certainly look at the question of deprecating old APIs, but please know that removing anything is incredibly difficult even if everyone wishes it would go away. There are many live applications that depend on deprecated APIs and we don't want to break the Web.

@magcius you're welcome to propose something specific for what you'd like to see in #337, but please try to be as concrete as you can -- sample code would be ideal, rather than descriptive language or pointing at other libraries that people may or may not be familiar with.

@juj

This comment has been minimized.

Copy link
Author

@juj juj commented Sep 15, 2017

... removing anything is incredibly difficult even if everyone wishes it would go away. There are many live applications that depend on deprecated APIs and we don't want to break the Web.

Web Audio API has a precedent with doing breaking changes already in the past. setVelocity was removed from Web Audio, breaking every single Emscripten-compiled asm.js/WebAssembly application from starting up in Chrome, and developers had to recompile their pages.

A better handled example is ScriptProcessorNode, which has been deprecated, but without crashing existing sites in day one of deprecation.

The regular protocol of deprecating before removing can applied here as well, nothing special in this case. Reading the conversation history, it does not look that removals of setVelocity and ScriptProcessorNode were particularly difficult processes after everyone agreed to want to do it. (#730 and #253)

Once the replacement APIs are in place, start by offering a warning in console ("this function call is deprecated and just now consumed X seconds of CPU time and Y megs of RAM, consider using Z instead"), and then after a migration period kill it off.

@joeberkovitz

This comment has been minimized.

Copy link
Contributor

@joeberkovitz joeberkovitz commented Sep 15, 2017

True -- and I didn't say "impossible", I said "difficult". All that I'm saying is, everyone needs to agree to do it and on the timeframe for doing it. Figuring out whether/how deprecation takes place is part of the discussion that will take place downstream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.