Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AudioContext.decodeAudioData() from a SharedArrayBuffer source? #1850

Closed
juj opened this issue Apr 18, 2019 · 19 comments
Closed

AudioContext.decodeAudioData() from a SharedArrayBuffer source? #1850

juj opened this issue Apr 18, 2019 · 19 comments
Assignees

Comments

@juj
Copy link

juj commented Apr 18, 2019

Multithreaded WebAssembly applications utilize a SharedArrayBuffer object as their heap type. These applications can contain encoded audio files inside their heaps, e.g. as part of a filesystem that they embed inside the heap.

However, these applications are unable to decode audio files from within their WebAssembly heaps. Calling

var sab = new SharedArrayBuffer(16);
var ac = new AudioContext();
ac.decodeAudioData(sab);

fails with

TypeError: Argument 1 of BaseAudioContext.decodeAudioData does not implement interface ArrayBuffer.

As a workaround, the applications need to create a temporary copy of the data from a SharedArrayBuffer to a regular ArrayBuffer, e.g.

var sab = new SharedArrayBuffer(16);
var ac = new AudioContext();

var ptr = /*address to start of .ogg file=*/0;
var length = /*length of .ogg file=*/16;
var audioData = new ArrayBuffer(length);
new Uint8Array(audioData).set(new Uint8Array(sab).subarray(ptr, ptr+length));

ac.decodeAudioData(sab);

If Web Audio spec allowed decodeAudioData directly from SABs, then the above deep copy of encoded .ogg files would be avoided in multithreaded WebAssembly applications.

CC @lars-t-hansen

@padenot padenot self-assigned this Apr 18, 2019
@lars-t-hansen
Copy link

This is mostly a specification wrinkle + some WebIDL, I expect; @annevk? From the point of view of the implementation it needs to make sure to use safe-for-races accesses (to avoid C++ UB) where that matters. I see @padenot is paying attention already so not going to do anything here myself.

@padenot
Copy link
Member

padenot commented Apr 23, 2019

This is something that would be very nice to have indeed, however it's not without problems.

The main issue I see is that decodeAudioData has to be the sole owner of the memory region on which it operates, for obvious safety reasons. In short, internally, we send this buffer to another browser subsystem, which does the decoding. This can in turn use various ways of doing the decoding: some implementation send it to ffmpeg, some use the system's facilities, some use a library to do the decoding. On mobile, it's not uncommon to have some hardware helping with the decoding process (most often for mp3 and AAC, iirc). This all depend on the codec, implementation and platform.

Having the possibility to mutate the bytes while the decoder is working probably cannot be proved safe in practice (due to the multiplicity of scenarios, a lot of which browser vendors don't have any control on). It used to be that the buffer wasn't detached, and this caused more than one security issue, iirc.

I'm particularly concerned by codecs that try to keep the memory usage down and locality high by not copying information out of the container, and instead referencing bits of memory. Let's say you get a length out of the container, at a certain offset, validated initially, and then a malicious bit of code changes the length under the codec, that still references this memory, and this can potentially cause an OOB. I can't prove that this is something that is not done today in code that we don't control, and I can't prove that this is something that cannot be done in the future.

In addition (and this is an implementation concern, not a spec concern), Firefox is now running the codecs in a separate process (because they are very complex piece of code, we're sandboxing them harder than the rest of the web content). This means that we would copy anyways (either to a SHMEM, or via a pipe, etc.), to decode. However this does it by block by block so the copies are smaller and the memory usage is not doubled.

Is there a facility in SharedArrayBuffer to mark a range as read-only for WASM for a period of time, with a lifetime for this read-only period dictated by the implementation? This would solve our issue nicely here, if all we want is to save the copy (and I presume saving a pair of allocation/deallocation, and having a transient higher memory usage).

Another thing that could be interesting would be to decode into a SharedArrayBuffer. That's probably easier, and I think it would be quite important, PCM data is often quite big, and authors using WASM would be happy to skip a copy I'm sure.

@lars-t-hansen
Copy link

Is there a facility in SharedArrayBuffer to mark a range as read-only for WASM for a period of time, with a lifetime for this read-only period dictated by the implementation?

Not at this time. We're talking about adding mprotect-like features to wasm in the future but it's unknown how that would interact with the needs of the audio subsystem -- I would think it might just make things even more complicated.

@annevk
Copy link

annevk commented Apr 23, 2019

From an IDL perspective, you'll need to use [AllowShared] for starters. There's then also the question as to whether the SharedArrayBuffer object can be used to create a high-resolution timer, assuming postMessage() functionality for it was disabled.

That is, we want to enable SharedArrayBuffer everywhere, but guard any usage of it that can create a high-resolution timer. If your feature can be used to create a high-resolution timer independent of other features (in particular including the sharing it with other threads one), you'll need an additional opt-in.

Does that make sense?

@hoch
Copy link
Member

hoch commented Apr 23, 2019

Is there a facility in SharedArrayBuffer to mark a range as read-only for WASM for a period of time, with a lifetime for this read-only period dictated by the implementation?

For the same reason, it would be great if we can use SAB for Audio Worklet's input and output buffer. The mechanism @padenot described above will be really helpful in any use case where the system (renderer, decoder and etc) needs to claim the exclusive access to a given memory space (or range), then releases the ownership when the task is completed.

@lars-t-hansen
Copy link

I don't really see that kind of mechanism being possible until wasm has something like mprotect and exception handling and we can phrase the semantics of the protection in those terms. In that case, we may be able to treat the audio subsystem simply as a concurrent agent that protects/unprotects parts of shared memory, and we'll have a model for what it means when other agents touch the protected area.

@rtoy rtoy added this to the Web Audio v.next milestone Apr 25, 2019
@padenot
Copy link
Member

padenot commented Apr 25, 2019

I think it's best to wait until wasm/SAB has the facilities needed here.

@lars-t-hansen, if discussions happens on something like mprotect/etc., can one of us be notified? I don't think we can do anything without it.

@hoch
Copy link
Member

hoch commented Apr 25, 2019

we may be able to treat the audio subsystem simply as a concurrent agent that protects/unprotects parts of shared memory

@lars-t-hansen I really like that idea. This new design will help a lot of performance-critical audio use cases.

We're punting this to v.next, but Audio WG might not be the right venue for discussion. Web Audio API or the other audio API will be a customer of such design.

@juj
Copy link
Author

juj commented May 3, 2019

Solving this at a higher level, if #1305 happened, and a more efficient cursor-based decoding API was available, then that could be used as a replacement. In such scenario, many applications would be able to avoid having to stuff the compressed audio files inside the wasm heap in the first place, if the browser was able to decode audio for them.

@mdjp
Copy link
Member

mdjp commented Sep 16, 2019

Closing this, it is not an area of the api that we will be working on in V2. However we hope that this
will be covered by https://github.com/WICG/web-codecs
https://github.com/WICG/web-codecs/blob/master/explainer.md
https://discourse.wicg.io/t/webcodecs-proposal/3662

@mdjp mdjp closed this as completed Sep 16, 2019
@padenot
Copy link
Member

padenot commented Sep 16, 2019

(It's a requirement of the web-codec effort to work well with the web audio api).

@guest271314

This comment was marked as off-topic.

@hoch
Copy link
Member

hoch commented Jan 21, 2020

It's a requirement of the web-codec effort to work well with the web audio api

Probably it's better to get that in the WebCodec explainer. :)

@guest271314

This comment was marked as off-topic.

@hoch
Copy link
Member

hoch commented Jan 21, 2020

Oh, I was merely quoting @padenot's comment - but "working well" means many things in many different layers; performance, API ergonomics, and ease of use. I won't try to clarify everything in this comment or this thread.

@guest271314

This comment was marked as off-topic.

@rtoy
Copy link
Member

rtoy commented Jan 21, 2020

The feature request template is being removed. We'll probably need to add some words somewhere that feature requests should go to the issues for the v2 repo.

@guest271314

This comment was marked as off-topic.

@rtoy
Copy link
Member

rtoy commented Jan 21, 2020

Don't know how it will work, but having two apis that overlap a little is not a great pattern. It seems more beneficial for everyone if WebCodecs can solve this, rather than grafting some kind of partial solution to decodeAudioData but not be able to do other things that WebCodecs can do. It's up to you and us to do our best to make it happen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants