Request to Expose AudioBuffer to DedicatedWorker #111
Comments
We need to reason about the (somewhat) hidden cost of the interleaving/deinterleaving operations we'll face, and the cost of the possible (probable) memory inflation this might cause.

Generally, decoding, encoding and I/O APIs use interleaved audio buffers, so that temporal locality is better: samples that are close in time are close in memory (they are played out, recorded, encoded and decoded together). Processing APIs often prefer deinterleaved buffers (also sometimes called planar, by analogy with the graphics world), because processing a particular audio channel with a particular algorithm is more efficient/practical that way. The Web Audio API always uses planar audio, because its main goal is processing. It happens to have a decoding API for historical reasons, which is made to interoperate with the rest of the API, and this contradicts the above.

Additionally, the Web Audio API always uses 32-bit floating-point numbers to represent audio samples when they are exposed to authors. Codecs commonly output (and input) 16-bit integers (sometimes fewer or more bits, but 16 bits is the most common). This is a two-fold increase in memory footprint for no good reason, which led Firefox for Android to adopt a lazy inflation scheme, where audio samples stay as 16-bit integers (what the decoder provides) until they need to be exposed to script. This works well, but it is implicit. https://github.com/WebAudio/web-audio-api-v2/issues/11 aims at making this explicit and allowing further optimizations.

It might well be that we're okay with this, and then implementations do those inflation/reordering operations in a lazy fashion. We could also do a spin-off of https://github.com/WebAudio/web-audio-api-v2/issues/11 and allow interleaved audio, alongside different sample representations. This seems like a better way to do things, and doesn't seem particularly hard (no harder than what is implemented today in Gecko).
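To make the layout and footprint point concrete, here is a minimal JavaScript sketch (plain typed arrays only, no Web Audio objects) of the conversion an implementation would perform when inflating a decoder's interleaved 16-bit output into the planar float32 layout AudioBuffer stores. The function name is illustrative, not part of any API.

```javascript
// Deinterleave a 16-bit interleaved buffer (what many decoders emit)
// into planar float32 channels (what AudioBuffer stores).
// Note the two-fold memory inflation: 2 bytes/sample -> 4 bytes/sample.
function deinterleaveToFloat32(int16Interleaved, channelCount) {
  const frames = int16Interleaved.length / channelCount;
  const planar = [];
  for (let c = 0; c < channelCount; c++) {
    const channel = new Float32Array(frames);
    for (let i = 0; i < frames; i++) {
      // Scale signed 16-bit [-32768, 32767] into float [-1, 1).
      channel[i] = int16Interleaved[i * channelCount + c] / 32768;
    }
    planar.push(channel);
  }
  return planar;
}

// Stereo frames are stored as L0 R0 L1 R1 ... when interleaved.
const interleaved = new Int16Array([16384, -16384, 32767, -32768]);
const [left, right] = deinterleaveToFloat32(interleaved, 2);
// left is [0.5, 32767/32768]; right is [-0.5, -1]
```

The per-sample divide and the strided reads are the "hidden cost" being discussed; a lazy scheme defers this loop until script actually touches the channel data.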
I think this is about exposing AudioBuffer to DedicatedWorker, not to AudioWorkletGlobalScope. Paul, is your concern mostly about the inflated memory footprint?
And CPU, memory traffic, etc., yes. Basically, unnecessary inefficiencies at the API boundary, because the type that exists today is not what a codec API needs. Just exposing it is fine; it's not hard. Indeed, this has nothing to do with
I agree that we need to look into the hidden implications, but exposing AudioBuffer itself seems harmless, and it would be useful for WebCodecs' use cases.
(Moving this to V2 because we're not changing V1 text at this stage.)
For reference: crbug.com/1160580 |
I would like to know what the actual use case is and whether AudioBuffer is the right API, as @padenot mentions. Perhaps there's a better solution.
AudioBuffer is used by WebCodecs to describe raw audio. It is essentially the output of AudioDecoder and the input to AudioEncoder. In both cases, we wrap the buffer in an AudioFrame to add some metadata and the ability to immediately release the buffer without waiting for GC (more important for VideoFrame, but consistency is good). We like AudioBuffer for several reasons.
Edit: Sorry, I saw @rtoy's comment in my email before seeing @padenot's earlier comments. I happily defer to @padenot and others expertise on the performance issues he outlines.
But I acknowledge Paul's point about other codec APIs not forcing planar float32.
@chcunningham If the current But this also allows us to update
I just checked, and Chromium's and ffmpeg's objects specifically support what I'm talking about here (the possibility of having interleaved samples, various bit depths, and various sample representations in their decoded audio data type). A comment on top of Chrome's declaration also mentions that a conversion to an If we extend
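As a back-of-the-envelope illustration of the footprint difference under discussion (the numbers below are chosen purely for illustration, not taken from any implementation):

```javascript
// Footprint of one second of 48 kHz stereo audio in the decoder's
// native 16-bit integer format versus AudioBuffer's 32-bit float format.
const sampleRate = 48000;
const channels = 2;
const frames = sampleRate * 1; // one second

const bytesS16 = frames * channels * Int16Array.BYTES_PER_ELEMENT;   // 192000
const bytesF32 = frames * channels * Float32Array.BYTES_PER_ELEMENT; // 384000

// The float32 copy is exactly twice the size, before counting any
// temporary buffers used during deinterleaving.
```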
Why delete comments critical of decisions based on conjecture without evidence? Is this not true and correct? If so, where are the tests?
Comments were not deleted. I can still see them if I click them. |
@padenot Thanks for the context! As someone who was not involved in WebCodecs' design, I am curious: why does AudioFrame include AudioBuffer? I believe AudioBuffer is specifically designed for Web Audio, so only the non-interleaved format makes sense.
The full reasoning is in my comment above. I made the call pretty much independently without consideration of the tradeoffs we're now discussing. Still, I think those points in favor of using AudioBuffer are compelling and it sounds like we have a solid path forward to resolve the issues raised here. @hoch do you agree?
If WebCodecs were to create its own AudioBuffer, it would be pretty much identical to WebAudio's except, perhaps, offering more formats (and clearly that is not a v1 requirement, even for WebCodecs). If we can instead add those formats to AudioBuffer, I think it simplifies how raw audio is described on the web platform and makes for clean integration with WebAudio.
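For concreteness, a purely hypothetical Web IDL sketch of what "adding those formats to AudioBuffer" could look like; the enum name, its values, and the dictionary member below are invented for illustration and appear in no spec:

```webidl
// Hypothetical, non-normative sketch only. Neither "SampleFormat"
// nor "sampleFormat" exists in the Web Audio API or WebCodecs IDL.
enum SampleFormat { "f32-planar", "f32-interleaved",
                    "s16-planar", "s16-interleaved" };

dictionary ExtendedAudioBufferOptions : AudioBufferOptions {
  SampleFormat sampleFormat = "f32-planar";
};
```

The default preserves today's behavior, while interleaved and 16-bit variants would let a decoder hand over its native output without an eager conversion, along the lines of web-audio-api-v2#11.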
I discussed the GC pressure issue with @chcunningham, and we believe Chromium's GC can handle the pressure from the rapid AudioBuffer object creation in the WebCodecs API. @padenot might be able to speak for Firefox's view on this. Here I see two paths:
Option 2 seems non-controversial. I am curious what other people think.
This deserves a separate issue in this tracker. Definitely not a V1 or short-term project.
Yes, I agree.
@padenot I understand we generally agree on this. Right?
Given user-defined
I have not found WebCodecs 'opus' Nonetheless, given One way to generate
Exposing
The issue is the wrapping in an An exposed
Fixing that should be the priority.
WICG banned me. Otherwise, I would strongly suggest filing a PR on the WebCodecs specification that includes the language that
I am curious why What is the reasoning for using a
Looks like the build is happy now! I've updated the opening comment w/ links to chromium bug and WPT tests. Please let me know if anything else is needed. |
@chcunningham this will happen automatically when we expose the Web Audio API in Web Workers, and Web Codecs doesn't need it anymore, so I'm closing this, thanks! |
WebCodecs uses AudioBuffer as a container of unencoded audio.
https://wicg.github.io/web-codecs/#dom-audioframe-buffer
The AudioFrame, which wraps AudioBuffer, is provided as an input to AudioEncoder and as an output from AudioDecoder. We've exposed the encoder and decoder interfaces to DedicatedWorker to facilitate offloaded codec I/O.
I think the change is as simple as updating the [Exposed] extended attribute for just AudioBuffer. I'll send a PR to demonstrate.
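Sketched against the Web Audio IDL (interface body elided), the requested change would look something like this:

```webidl
// Before (current Web Audio API IDL):
[Exposed=Window]
interface AudioBuffer { /* ... */ };

// After (what this issue requests):
[Exposed=(Window, DedicatedWorker)]
interface AudioBuffer { /* ... */ };
```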