
DecodeAudioData doesn't work on partial content #337

Closed
nums opened this issue Jun 11, 2014 · 29 comments

Comments

@nums commented Jun 11, 2014

Hello,

When we want to use it for audio streaming, as with the Media Source API (https://dvcs.w3.org/hg/html-media/raw-file/tip/media-source/media-source.html), decodeAudioData does not work when trying to decode partial content, even though the audio tag plays that same content very well.

Example: http://dashif.org/reference/players/javascript/0.2.0/index.html

@rtoy (Member) commented Jun 13, 2014

I believe that is the intended behavior for decodeAudioData. If you want to stream data to Web Audio, you can use a MediaElementAudioSourceNode.
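For reference, the suggestion above can be sketched like this. It is browser-only code; the function name streamThroughWebAudio is illustrative, not part of any API, and the media element handles progressive download itself:

```javascript
// Hedged sketch: route a streamed URL into the Web Audio graph via an
// <audio> element instead of decodeAudioData. Browser-only.
function streamThroughWebAudio(url, audioContext) {
  const mediaElement = new Audio(url);     // element handles buffering/streaming
  mediaElement.crossOrigin = 'anonymous';  // needed for cross-origin streams
  const sourceNode = audioContext.createMediaElementSource(mediaElement);
  sourceNode.connect(audioContext.destination);
  mediaElement.play();
  return { mediaElement, sourceNode };
}
```

The trade-off, as the later comments note, is that you never get direct access to the decoded AudioBuffer this way without going through a ScriptProcessorNode.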

@nums (Author) commented Jun 19, 2014

Hello,

Thanks for your reply.

Through a MediaElementSource, I have to go through a JavaScriptNode to recover the AudioBuffer in real time; I did not find another way to do that without a JavaScriptNode.
It would be great if the Web Audio API could natively decode partial content.

@jernoble (Member) commented:

On Jun 19, 2014, at 2:41 AM, Emmanuel Freard notifications@github.com wrote:

> Hello,
>
> Thank you for your reply.
>
> Through a MediaElementSource, I have to go through a JavaScriptNode to recover audiobuffer in real time. I did not find an other way to do that without JavascriptNode.
> It will be great if Web Audio API could natively decode partial content.

I agree. It would be especially useful when paired with a queueing AudioBufferSourceNode, where partial output from the decoder could be appended to an existing node.

-Jer

@ghost commented Oct 23, 2014

I'm a bit late to this one, but I can't see how the decoding of partial content could be done automatically with decodeAudioData(). File formats such as Ogg Vorbis (and MP3, if I remember correctly) store the information required to decode the audio in the file header. The decoder could store that information somewhere, but it would have no idea whether N calls to decodeAudioData() were sequential and/or partial chunks from the same source file. Also, the structure of the source file needs to be considered; some formats can only be decoded in chunks (frames) of specific sizes.

It's actually possible to "stream" audio now using decodeAudioData() if the programmer understands the format of a particular audio file. I managed to do this with Ogg Vorbis (load and decode it in chunks) a while ago by storing the required decoding information and prepending it to the beginning of each chunk prior to decoding. The data loading and splicing can be done in a worker and then transferred (zero-copy) to the main thread for decoding.

That being said, the existing media elements do all of this already, so you would need a really good reason not to use them for streaming, IMHO :)
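The header-prepending trick described above can be sketched as follows. This is a hedged illustration, not the commenter's actual code: it assumes you have already extracted the codec header bytes (for Ogg Vorbis that means the first few header pages, not just a fixed-size prefix), and makeChunkDecoder / withHeader are invented names:

```javascript
// Sketch: glue a saved codec header onto each raw chunk so that a
// whole-file decoder such as decodeAudioData can be handed something
// that looks like a complete, self-describing file.
function makeChunkDecoder(headerBytes) {
  return function withHeader(chunkBytes) {
    const combined = new Uint8Array(headerBytes.length + chunkBytes.length);
    combined.set(headerBytes, 0);                  // header first
    combined.set(chunkBytes, headerBytes.length);  // then the chunk payload
    return combined;
  };
}

// Browser-side use (illustrative, not run here):
//   const withHeader = makeChunkDecoder(savedHeaderBytes);
//   const audioBuffer = await ctx.decodeAudioData(withHeader(chunk).buffer);
```

The splicing itself is cheap and can live in a worker, with the combined buffer transferred to the main thread, exactly as the comment describes.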

@cwilso (Contributor) commented Oct 29, 2014

I think we made the decision to push a more powerful decoding API to v.next?

@cwilso cwilso added this to the Needs WG decision milestone Oct 29, 2014
@padenot (Member) commented Oct 29, 2014

That's what I remember as well.

@joeberkovitz joeberkovitz modified the milestones: Web Audio v.next, Needs WG decision Nov 13, 2014
jernoble added a commit to jernoble/web-audio-api that referenced this issue Jan 28, 2015
Add a new class, AudioStreamParser, modeled on the SourceBuffer from the Media Source Extensions <http://w3c.github.io/media-source/> specification, which allows for encoded audio data to be progressively appended to the object, resulting in one or more AudioBuffers.

resolves WebAudio#337
@rtoy (Member) commented Jun 24, 2016

Based on the comments in #337 (comment) and #337 (comment), I think we've decided to move this to v.next. Adding Needs WG review for a final decision.

@samelie commented Jun 29, 2016

+1

Posted a question here

@jdsmith3000 (Contributor) commented:

Discussed at today's Working Group call and agreed to move to v.next, along with issue #30.

@bjornm commented Nov 8, 2017

At Soundtrap, we have custom codecs using WebAssembly-compiled Ogg Vorbis. These are exposed in async streaming JS APIs that allow you to decode audio data in chunks. You specify the offset and number of frames and get an audio buffer back. This, in turn, can be used to skip or loop. It can run in its own JS worker.

This streamed data is currently scheduled in 1s chunks in back-to-back AudioBufferSourceNodes to emulate an audio queue with gapless playback. While it works (and is fun!), I'm really looking forward to replacing that with a streaming decode audio buffer api and audio worklets.

If anyone in the working group is interested, I'd be happy to demo the whole use case and the code to better illustrate the problem we're trying to solve and the api's we ended up writing.
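The back-to-back scheduling described above relies on the fact that AudioBufferSourceNode.start() takes a precise time on the AudioContext clock. A minimal sketch of the pure part, with an invented helper name (computeStartTimes), assuming each chunk's exact duration is known after decoding:

```javascript
// Sketch: each chunk starts exactly where the previous one ends, so
// playback is gapless as long as chunks are queued ahead of the clock.
function computeStartTimes(firstStartTime, chunkDurations) {
  const startTimes = [];
  let t = firstStartTime;
  for (const d of chunkDurations) {
    startTimes.push(t);
    t += d; // next chunk begins at the previous chunk's end
  }
  return startTimes;
}

// Browser-side use (illustrative): for each decoded AudioBuffer buf,
//   const src = ctx.createBufferSource();
//   src.buffer = buf;
//   src.connect(ctx.destination);
//   src.start(startTimes[i]);  // sample-accurate, back-to-back
```

Three 1 s / 1 s / 0.25 s chunks scheduled from t = 0.5 would start at 0.5, 1.5, and 2.5 seconds on the context clock.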

@bjornm commented Mar 11, 2018

Someone asked me to explain a bit more about the Soundtrap javascript audio codec library we use internally.

The audio codec library is written in Dart / JavaScript / WebAssembly. It does not handle any networking aspect of the decoding / encoding; it works on typed arrays (e.g. Float32Array and Uint8Array). These could come from the network, from memory, or from IndexedDB, so they work in a variety of use cases. There are some abstractions in the API to allow for flexibility in implementation, pluggability, and the language used.

AudioCodec
The AudioCodec class contains a registry of the available codecs for the platform and provides methods to get such decoders.
AudioCodec.canDecode('mp3') => true
AudioCodec.canEncode('flac') => false
etc

It acts as a pluggable registry so we can implement and register codecs in AudioCodec independently of any client code asking to use the codecs.
AudioCodec.registerEncoder(...class implementing AudioEncoder...)
AudioCodec.registerDecoder(...class implementing AudioDecoder...)

AudioDecoder and StreamingAudioDecoder
When wanting to decode some byte data:
var decoder = AudioCodec.getDecoder('mp3', bytes);
var metaData = decoder.getMetaData();
// metaData.channels
// metaData.sampleRate
// metaData.length
var offset = 0;
var length = 10000;
var chunk = await decoder.decodeChunk(offset, length); // returns a Float32Array[] with one array per channel of audio. Asynchronous promise returned
...
decoder.end();

Implementations
Streamable implementations of AudioDecoder provide the methods above: getMetaData() and decodeChunk(). Some of them are implemented in JavaScript (e.g. the AIFF decoder), some in Wasm (e.g. Ogg Vorbis), and some are a hybrid of JavaScript and Web Audio (using AudioContext.decodeAudioData). In cases where we cannot (yet) stream audio, there is a simple AudioDecoder interface without the chunked methods, simply returning the full result directly.
var full = await AudioDecoder.decodeFully(...); // returns Float32Array[] or AudioBuffer

AudioEncoder and StreamingAudioEncoder
Similar interface to AudioDecoder but takes Float32Array and returns Uint8List. Depending on the codec, the constructor will take options specific to the codec (e.g. bitrate).
var encoder = AudioCodec.getEncoder(numChannels, sampleRate, options);
var chunk = await encoder.encodeChunk(channelData); // Passes in Float32Array[] and returns Uint8Array or ByteBuffer

As can be seen from the example above, the returned decoder is seekable, since you can specify an offset and length for decodeChunk(). This allows us to loop audio while playing back in a gapless fashion.

A limitation of the example above is that all encoded data is passed in the constructor, which prevents streaming from the network and wastes memory. A future version will replace the "bytes" argument with something like the Streams API, agnostic of memory / network / disk, where the encoded data stream can be seeked and read in order to support the decodeChunk() call, e.g.
Stream {
Promise seek(int offset);
Promise read(int count);
Promise close();
}
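To make that seek/read/close contract concrete, here is a hypothetical in-memory implementation (MemoryStream is an invented name, not part of the Soundtrap library). A real version would wrap fetch() with Range requests or the Streams API instead of a Uint8Array:

```javascript
// Sketch: an in-memory source satisfying the Stream interface above.
class MemoryStream {
  constructor(bytes) {
    this.bytes = bytes;
    this.pos = 0;
    this.closed = false;
  }
  seek(offset) {
    this.pos = offset;           // reposition the read cursor
    return Promise.resolve();
  }
  read(count) {
    // Return up to `count` bytes from the current position.
    const slice = this.bytes.subarray(this.pos, this.pos + count);
    this.pos += slice.length;
    return Promise.resolve(slice);
  }
  close() {
    this.closed = true;
    return Promise.resolve();
  }
}
```

Because the interface is Promise-based throughout, a network-backed implementation can be swapped in without changing the decoder code that calls it.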

@anthumchris commented:

@nums I'm writing a demo and providing code examples for decoding audio in chunks with Streams. This started as a proof of concept to bypass the current limitation of decodeAudioData() requiring complete files: AnthumChris/fetch-stream-audio

@bjornm I took your advice and started with WAV decoding since it's easier and this is my first experience coding with audio. Good foresight.

@anthumchris commented:

Here's a JavaScript Ogg Opus decoder that uses Wasm and libopusfile to decode streams in chunks:

https://github.com/AnthumChris/opus-stream-decoder

@mdjp (Member) commented Sep 17, 2019

This will now be handled by https://discourse.wicg.io/t/webcodecs-proposal/3662

@mdjp mdjp closed this as completed Sep 17, 2019
V2 Preparation (DO NOT USE) automation moved this from To do to Done Sep 17, 2019
@guest271314's comment was marked as off-topic.

@hillct commented Jan 19, 2020

> Cannot decodeAudioData() be modified to do this [#337 (comment)]

Almost certainly. The argument for implementing it directly within decodeAudioData() would be the significant performance benefit you'd get, relative to implementing stream filtering based upon certain file stream tags, with repeated calls to decodeAudioData() to achieve the same effect.

> I managed to do this with OGG Vorbis (load and decode it in chunks) a while ago by storing the required decoding information and prepending it to the beginning of each chunk prior to decoding.

We used the same approach to achieve streaming (partial-download) segmented decoding of MP3 files, but network performance variability rendered any buffering that can be implemented at the JS level insufficient for reliable audio playback without drops and other artifacts.

As such, it needs to be implemented at a lower level, within the API, where more reliable buffering can be achieved.

Several comments from @guest271314 were marked as off-topic.

@padenot (Member) commented Jan 20, 2020

Please have this discussion elsewhere, thanks.

@guest271314's comment was marked as off-topic.

@padenot (Member) commented Jan 20, 2020

New feature requests go in the v2 repo. New feature requests about audio decoding and encoding go in the WebCodecs repo, but this use case has been handled from day one there.

@guest271314's comment was marked as off-topic.

@sahi1422 commented:

I still don't understand how to decode partial audio data. Can someone help?
Thanks.

@rtoy (Member) commented Nov 12, 2020

> I still don't understand how to decode partial audio data. Can someone help?

You can't with decodeAudioData directly. It expects the entire file and expects to decode all of it.

You will have to do it yourself. WebCodecs will help here, I think.

@juj commented Sep 24, 2021

> New feature requests go in the v2 repo. New feature requests about audio decoding and encoding go in the Web Codecs repo, but this use case has been handled from day one there.
>
> You will have to do it yourself. WebCodecs will help here, I think.

I tried to use WebCodecs today to do this (unfortunately failing so far), and posted w3c/webcodecs#366 about my findings. Any help there would be appreciated!

Several comments from @guest271314 were marked as off-topic.
