
Audio decoder output to regular buffers and not AudioBuffer #179

Closed
padenot opened this issue Apr 9, 2021 · 14 comments · Fixed by #228

Comments

@padenot
Collaborator

padenot commented Apr 9, 2021

Quite a few programs or libraries don't really use the Web Audio API with the native nodes etc., and rely on either an AudioWorklet or even a ScriptProcessorNode to play their audio. All the DSP happens in user code (probably WASM), because that code is already written, and the Web is "just" another target in this context.

It seems like we could add an output mode for our audio decoders that writes to a user-provided buffer, to minimize copies.

This would also sidestep the issue where (for now) the AudioBuffer object is always planar f32. Quite a few authors would like integers, both for memory-footprint reasons and because that's what codec libraries frequently output, so no useless conversion is needed.

Forcing the use of AudioBuffers (which have quite a few constraints) when any WASM user probably has their own audio engine seems like a sub-optimal solution for them. This would also simplify the spec and the code, I guess.

@sandersdan
Contributor

sandersdan commented Apr 12, 2021

Agreed, AudioBuffer is limiting. AudioFrame was designed to minimize dependence on it (the AudioBuffer only needs to be created when accessed), so we can add new ways to use the audio data in a backwards-compatible way.

We don't have a buffer pool API to hook this into, but there are other parts of WebCodecs that could benefit from such a thing, and I expect we'll work on that in v2.

Would you prefer that it be changed to a getter method rather than an attribute?

@padenot
Collaborator Author

padenot commented Apr 27, 2021

As discussed with @chcunningham, this is expected to be more important and useful for authors than using AudioBuffer, and it would be in addition to AudioBuffer, not instead of it, modulo the design point mentioned in https://github.com/WebAudio/web-audio-api-v2/issues/119#issuecomment-808330658.

This would remove the dependency on Web Audio API V2 and allow this to be self-contained. I'd be open to an implementation shipping with this and not the AudioBuffer variant, if the API allows for feature detection.

The general idea is to pass ArrayBuffers for input and output, detaching the buffers, and then giving both buffers back on output, to allow authors to implement their own recycling if they want. This is not a standard pattern (yet), but it should be, since it's the closest thing to good memory access patterns until there is read-only memory.
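A rough sketch of that pattern from the caller's side, for illustration only (the outputBuffer option and the buffer field on the output are hypothetical names for the idea described above, not a shipped API):

// Hypothetical BYOB-style decode with buffer recycling: decode()
// detaches outputBuffer, and the decoder hands it back in the output.
declare const decoder: {
  decode(chunk: unknown, opts: { outputBuffer: ArrayBuffer }): void;
};
declare const chunk: unknown;
declare function consumeSamples(buf: ArrayBuffer): void;

const pool: ArrayBuffer[] = [];

function acquire(byteLength: number): ArrayBuffer {
  return pool.pop() ?? new ArrayBuffer(byteLength);
}

function onOutput(output: { buffer: ArrayBuffer }): void {
  consumeSamples(output.buffer); // user-code DSP, e.g. in WASM
  pool.push(output.buffer);      // recycle instead of reallocating
}

decoder.decode(chunk, { outputBuffer: acquire(4096) });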

We might be able to do real zero-copy for WASM output if we do it correctly, but it means the API will have to diverge even more from what is usually done (essentially passing in the Memory object, an offset, and a length).
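And the zero-copy WASM variant alluded to would look roughly like this (again, every option name here is hypothetical):

// Hypothetical zero-copy output: the decoder writes decoded samples
// directly into WASM linear memory at a caller-chosen location.
declare const decoder: {
  decode(
    chunk: unknown,
    opts: { destination: WebAssembly.Memory; offset: number; length: number }
  ): void;
};
declare const chunk: unknown;

const memory = new WebAssembly.Memory({ initial: 16 }); // 16 pages = 1 MiB

decoder.decode(chunk, {
  destination: memory, // the Memory object itself
  offset: 65536,       // byte offset the decoder may write at
  length: 4096,        // capacity in bytes available at that offset
});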

@chcunningham
Collaborator

chcunningham commented Apr 28, 2021

This issue has some overlap with a parallel discussion in #162 (comment)

TL;DR - AudioBuffer forces some snapshotting behavior ("acquire the content") which is not super intuitive.

What if we do something like this:

interface AudioFrame {
    constructor(AudioFrameInit init);

    readonly attribute long long timestamp;  // microseconds

    // Deprecated. Will make a lazy copy.
    readonly attribute AudioBuffer buffer;

    readonly attribute unsigned long length; // in sample-frames
    readonly attribute double duration; // in seconds
    readonly attribute float sampleRate; // in sample-frames per second

    // Channel access
    readonly attribute unsigned long numberOfChannels;
    void copyFromChannel(
        Float32Array destination,
        unsigned long channelNumber,
        optional unsigned long bufferOffset = 0);

    // Creates an AudioFrame, which needs to be independently closed.
    AudioFrame clone();
    void close();
};

Basically, hoist up the immutable AudioBuffer methods and deprecate AudioBuffer. This fixes some of our issues while saving the matter of transferring a BYOB for later.
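As a usage sketch of the proposed shape (the AudioFrame type below mirrors the IDL above, and renderQuantum is just an illustrative consumer):

// Consuming a decoded frame channel-by-channel without touching the
// deprecated AudioBuffer attribute.
interface AudioFrame {
  readonly timestamp: number;
  readonly length: number;
  readonly numberOfChannels: number;
  copyFromChannel(
    destination: Float32Array,
    channelNumber: number,
    bufferOffset?: number
  ): void;
  close(): void;
}

declare function renderQuantum(channel: number, samples: Float32Array): void;

function onAudioFrame(frame: AudioFrame): void {
  const scratch = new Float32Array(frame.length);
  for (let ch = 0; ch < frame.numberOfChannels; ch++) {
    frame.copyFromChannel(scratch, ch); // planar f32 copy of channel ch
    renderQuantum(ch, scratch);
  }
  frame.close(); // release the underlying resources when done
}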

@chcunningham
Collaborator

The above needs a sampleFormat.

At first: planar float32 (we can bikeshed on the name).

We discussed whether to bake "planar" or "interleaved" into a single enum that simultaneously describes the format (float32)... no strong feeling. I lean toward the combined enum (for rough symmetry with VideoFrame's pixel format).

linear vs. alaw/ulaw:
Resolved: let's only have linear in "frames". Treat alaw/ulaw as "encoded": they would be acceptable in chunks (we'll decode to linear).

@chcunningham
Collaborator

Re: sample format, I propose we borrow the naming conventions used in ffmpeg, but drop the AV_SAMPLE_FMT prefix (analogous to the "I420" pixel format used in VideoFrame).

Something like this could be a good starting list:

U8,    ///< unsigned 8 bits
S16,   ///< signed 16 bits
S32,   ///< signed 32 bits
FLT,   ///< float
S16P,  ///< signed 16 bits, planar
S32P,  ///< signed 32 bits, planar
FLTP,  ///< float, planar

These are the formats Chromium seems to recognize from ffmpeg. LMK if this is missing something obvious or if one of these is really rare. My first PR will just add FLTP to keep it simple (matching what we already have with AudioBuffer).
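To make the planar/interleaved distinction concrete, here is a stereo buffer of three sample-frames in both layouts:

// "S16": interleaved. One plane, channels alternate per sample-frame.
const s16 = Int16Array.of(
  100, -100, // frame 0: left, right
  200, -200, // frame 1
  300, -300, // frame 2
);

// "S16P": planar. One contiguous plane per channel.
const s16p = [
  Int16Array.of(100, 200, 300),    // left plane
  Int16Array.of(-100, -200, -300), // right plane
];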

@padenot
Collaborator Author

padenot commented Apr 29, 2021

I was going to ask whether S32 is in fact used to represent 24-bit integer audio samples (which are quite common), but your link answers that (if we're doing it the same way ffmpeg does, which I think is sensible, for the same reasons: there is no 24-bit type in WASM either, and nobody wants to do computations by hand).
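For concreteness, the conventional widening left-justifies each packed little-endian 24-bit sample into the top three bytes of an S32 value (to my understanding this matches how ffmpeg carries s24 in s32); a minimal sketch:

// Widen packed little-endian 24-bit PCM into S32 by left-justifying
// each sample. The 24-bit sign bit lands in bit 31, so JavaScript's
// signed 32-bit bitwise semantics provide sign extension for free.
function s24leToS32(packed: Uint8Array): Int32Array {
  const out = new Int32Array(packed.length / 3);
  for (let i = 0; i < out.length; i++) {
    out[i] =
      (packed[3 * i] << 8) |
      (packed[3 * i + 1] << 16) |
      (packed[3 * i + 2] << 24);
  }
  return out;
}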

@padenot
Collaborator Author

padenot commented Apr 29, 2021

Resolved: let's only have linear in "frames". Treat alaw/ulaw as "encoded": they would be acceptable in chunks (we'll decode to linear).

To clarify with an example: ulaw would decode to 16-bit integers so that the dynamic range is not lost.
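For reference, the classic G.711 µ-law expansion that produces those 16-bit linear samples looks like this (a sketch of the standard Sun-derived formulation):

// G.711 mu-law expansion: 8-bit code -> signed 16-bit linear sample.
// Codes are stored bit-inverted: 1 sign bit, 3-bit exponent,
// 4-bit mantissa, with a bias of 0x84.
function ulawToLinear(code: number): number {
  const u = ~code & 0xff;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return u & 0x80 ? -magnitude : magnitude; // output range: -32124..32124
}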

@tguilbert-google
Member

Small comment on the proposed interface above:

Once buffer is removed, there will be no way to check if a frame is closed from JS. There should perhaps be an attribute indicating that the frame is closed, and that calling copyFromChannel will throw.

chcunningham added a commit that referenced this issue Apr 30, 2021
The mutability of AudioBuffer was undesirable. Also, we like having more
sample formats. See discussion in #179.
@padenot
Collaborator Author

padenot commented Apr 30, 2021

there will be no way to check if a frame is closed from JS

isn't the "classic" way to check this is buffer.length == 0 ? I reckon it's not explicit per se, but it's what authors to today to check whether a buffer has been detached.

@tguilbert-google
Member

there will be no way to check if a frame is closed from JS

isn't the "classic" way to check this is buffer.length == 0 ? I reckon it's not explicit per se, but it's what authors to today to check whether a buffer has been detached.

Yes. But if we remove buffer in favor of copyFromChannel(), as per Chris' comment:

// Deprecated. Will make a lazy copy.
readonly attribute AudioBuffer buffer;

We won't be able to check this anymore. Furthermore, the act of checking whether a frame is closed or not could force the potential lazy copy.

@chcunningham
Collaborator

For reference, I dug up the ECMAScript spec on detachment
https://tc39.es/ecma262/#sec-detacharraybuffer

WDYT of trying to follow the TypedArray style by similarly zeroing out the analogous fields of AudioData (length, duration, sampleRate, numberOfChannels, ...)? We can make this a clear signal by spec'ing that it is otherwise invalid to construct an AudioData from an empty BufferSource.
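Under that proposal, detecting a closed frame would not force the lazy AudioBuffer copy; assuming the zeroed-fields behavior, a check like this would suffice:

// A closed AudioData would report zeroed fields, so any of them can
// serve as the signal, without touching the deprecated buffer attribute.
declare const frame: { length: number; numberOfChannels: number; sampleRate: number };

const isClosed = frame.numberOfChannels === 0; // or frame.length === 0, etc.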

@chcunningham
Collaborator

For those reading along, the PR landed with unresolved issues (done to unblock approved dependent PRs). I'll send a follow up shortly.

@chcunningham added this to the V1 Launch Blockers milestone May 5, 2021
@padenot
Collaborator Author

padenot commented May 5, 2021

Thanks, I opened #223 for following up, with a link to the context from Dale, you, and myself.

@chcunningham
Collaborator

Closing this one, as follow-up work is now tracked in #223.
