Audio decoder output to regular buffers and not AudioBuffer (#179)
Agreed. We don't have a buffer pool API to hook this into, but there are other parts of WebCodecs that could benefit from such a thing, and I expect we'll work on that in v2. Would you prefer it was changed to a getter method rather than an attribute?
As discussed with @chcunningham, this is expected to be more important and useful for authors than using AudioBuffer. This would remove the dependency on Web Audio API V2, and allow this to be self-contained. I'd be open for an implementation to ship with this and not the AudioBuffer path.

The general idea is to pass in a user-provided buffer. We might be able to do real zero-copy for WASM output if we do it correctly, but it means the API will have to diverge even more from what is usually done (essentially passing in the caller's memory directly).
This issue has some overlap with a parallel discussion in #162 (comment). TL;DR: AudioBuffer forces some snapshotting behavior ("acquire the content") which is not super intuitive. What if we do something like this:
Basically: hoist up the immutable AudioBuffer methods and deprecate AudioBuffer. This fixes some of our issues while saving the matter of transferring a BYOB for later.
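In TypeScript terms, the hoisting idea might look something like this. This is a hypothetical sketch: the `AudioFrameSketch` name and its planar-f32 storage are my illustration, not the proposed IDL.

```typescript
// Hypothetical sketch: an immutable frame that exposes AudioBuffer's
// read-only surface (copyFromChannel) without exposing a mutable
// AudioBuffer, so there is no "acquire the content" snapshotting.
class AudioFrameSketch {
  constructor(
    readonly sampleRate: number,
    readonly numberOfChannels: number,
    readonly numberOfFrames: number,
    private planes: Float32Array[], // planar f32 storage, one array per channel
  ) {}

  // Same contract as AudioBuffer.copyFromChannel: copy samples from one
  // channel into a caller-owned destination, starting at startInChannel.
  copyFromChannel(destination: Float32Array, channelNumber: number, startInChannel = 0): void {
    const src = this.planes[channelNumber].subarray(startInChannel);
    destination.set(src.subarray(0, Math.min(src.length, destination.length)));
  }
}

const frame = new AudioFrameSketch(48000, 2, 4, [
  new Float32Array([0.1, 0.2, 0.3, 0.4]),
  new Float32Array([0.5, 0.6, 0.7, 0.8]),
]);
const dst = new Float32Array(2);
frame.copyFromChannel(dst, 1, 1); // copies samples 1..2 of channel 1
```

Because the frame never hands out its backing storage, callers can only ever read through an explicit copy, which is the property the snapshotting rules were trying to guarantee.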
The above needs a sampleFormat. At first: planar-float32 (we can bikeshed on the name). We discussed whether to bake "planar" vs. "interleaved" into a single enum that simultaneously describes the sample format (float32)... no strong feeling. I lean toward a combined enum (for rough symmetry w/ VideoFrame pixel formats). Linear vs. alaw / ulaw:
Re: sample format, I propose we borrow the naming conventions used in ffmpeg, but drop the AV_SAMPLE_FMT prefix (analogous to the "I420" pixel format used in VideoFrame). Something like this could be a good starting list:
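The list itself didn't survive extraction; for reference, ffmpeg's sample formats with the prefix dropped would look roughly like this (a reconstruction for illustration, not necessarily the exact list that was posted):

```typescript
// Illustrative reconstruction of ffmpeg's sample formats with the
// AV_SAMPLE_FMT_ prefix dropped (not necessarily the original list).
const interleaved = ["u8", "s16", "s32", "flt", "dbl"] as const;

// ffmpeg marks the planar variant of each format with a "p" suffix.
const planar = interleaved.map((f) => `${f}p`);

console.log(planar.join(",")); // u8p,s16p,s32p,fltp,dblp
```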
These are the formats Chromium seems to recognize from ffmpeg. LMK if this is missing something obvious or if one of these is really rare. My first PR will just add FLTP to keep it simple (matching what we already have w/ AudioBuffer).
I was going to ask whether...
To clarify with an example:
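The example itself is missing from the scrape; here is a sketch of the interleaved-vs-planar distinction it presumably illustrated (sample values are made up):

```typescript
// Two channels (L, R), four frames each. Planar keeps one contiguous
// run per channel; interleaved alternates samples frame by frame.
const left  = [0.1, 0.2, 0.3, 0.4];
const right = [0.5, 0.6, 0.7, 0.8];

// Planar ("fltp"-style): [L0 L1 L2 L3][R0 R1 R2 R3]
const planarLayout = [...left, ...right];

// Interleaved ("flt"-style): L0 R0 L1 R1 L2 R2 L3 R3
const interleavedLayout = left.flatMap((l, i) => [l, right[i]]);

console.log(interleavedLayout); // [ 0.1, 0.5, 0.2, 0.6, 0.3, 0.7, 0.4, 0.8 ]
```

This is why a single combined enum (fltp vs. flt) can describe both the sample type and the memory layout in one token.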
Small comment on the proposed interface above: once buffer is removed, there will be no way to check from JS whether a frame is closed. There should perhaps be an attribute indicating that the frame is closed and that calling copyFromChannel will throw.
The mutability of AudioBuffer was undesirable. Also, we like having more sample formats. See discussion in #179.
Isn't the "classic" way to check this to look at buffer?
Yes. But if we remove buffer, in favor of copyFromChannel(), as per Chris' comment:
We won't be able to check this anymore. Furthermore, the act of checking whether a frame is closed or not could force the potential lazy copy. |
For reference, I dug up the ECMAScript spec on detachment. WDYT of trying to follow the TypedArray style by similarly zeroing out analogous fields of AudioData (length, duration, sampleRate, numberOfChannels, ...)? We can make this a clear signal by spec'ing that it is otherwise invalid to construct an AudioData from an empty BufferSource.
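For readers unfamiliar with detachment, this is roughly how a detached ArrayBuffer behaves today; the proposal is for a closed AudioData's fields to zero out the same way (a sketch; it assumes structuredClone with transfer support, as in modern browsers and Node 17+):

```typescript
const buf = new ArrayBuffer(16);
const view = new Float32Array(buf);

// Detach the buffer by transferring it, as postMessage would.
// (Cast avoids relying on structuredClone being in the ambient typings.)
(globalThis as any).structuredClone(buf, { transfer: [buf] });

// After detachment the size-related fields all read as zero rather than
// throwing; the proposal is for AudioData's length, duration, sampleRate,
// numberOfChannels to behave analogously once the frame is closed.
console.log(buf.byteLength, view.length); // 0 0
```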
For those reading along, the PR landed with unresolved issues (done to unblock approved dependent PRs). I'll send a follow-up shortly.
Thanks! I opened #223 for follow-up, with a link to the context from Dale, you, and myself.
Closing this one, as follow-up work is now tracked in #223.
Quite a few programs or libraries don't really use the Web Audio API with the native nodes etc., and rely on either an AudioWorklet or even a ScriptProcessorNode to play their audio. All the DSP happens in user code (probably WASM), because that's code that is already written, and the Web is "just" another target in this context. It seems like we could add an output mode for our audio decoders that would write to a user-provided buffer, to minimize copies.

This would also sidestep the issue where (for now) the AudioBuffer object is always planar f32; quite a few authors would like integers for memory-footprint reasons, and also because that's what codec libraries frequently output, so there is no need for a useless conversion. Forcing the use of AudioBuffers (which have quite a few constraints) when any user of WASM probably has their own audio engine seems like a sub-optimal solution for them. This would also simplify the spec and the code, I guess.
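A minimal sketch of the bring-your-own-buffer output mode the issue asks for. The function name and shape here are illustrative only, not a proposed API:

```typescript
// Hypothetical decoder output step that copies decoded samples into a
// caller-owned buffer (e.g. a view into WASM linear memory) instead of
// allocating a fresh AudioBuffer per frame. Names are illustrative.
function copyDecodedAudio(
  decoded: Int16Array,     // pretend this came from the decoder, interleaved s16
  destination: Int16Array, // caller-owned, e.g. new Int16Array(wasmMemory.buffer, ptr, n)
): number {
  if (destination.length < decoded.length) {
    throw new RangeError("destination too small"); // BYOB APIs must size-check
  }
  destination.set(decoded); // single copy into user memory, no new allocation
  return decoded.length;    // samples written
}

// Simulated use: reuse one scratch buffer across decode callbacks,
// so steady-state decoding allocates nothing per frame.
const scratch = new Int16Array(8);
const written = copyDecodedAudio(new Int16Array([100, -100, 200, -200]), scratch);
console.log(written, scratch[2]); // 4 200
```

Note this also sidesteps the planar-f32 constraint: the destination keeps whatever integer format the codec produced, with no conversion.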