TAG Issue: Layering considerations #257

Closed
chrislo opened this issue Oct 17, 2013 · 10 comments
Labels
w3c-tag-tracker Group bringing to attention of the TAG, or tracked by the TAG but not needing response.

Comments

@chrislo (Member) commented Oct 17, 2013

The following point was raised by the W3C TAG as part of their review of the Web Audio API. It covers a number of concerns, which we can split into separate issues if required. For now, let's capture our response here.

Layering Considerations

Web Audio is very low-level and this is a virtue. By describing a graph that operates in terms of samples of bytes, it enables developers to tightly control the behavior of processing and ensure low-latency delivery of results.

Today's Web Audio spec is an island: connected to its surroundings via loose ties, not integrated into the fabric of the platform as the natural basis and explanation of all audio processing -- despite being incredibly fit for that purpose.

Perhaps the most striking example of this comes from the presence in the platform of both Web Audio and the <audio> element. Given that the <audio> element is incredibly high-level, providing automation for loading, decoding, playback, and UI to control these processes, it would appear that Web Audio lives at an altogether lower place in the conceptual stack. A natural consequence of this might be to re-interpret the <audio> element's playback functions in terms of Web Audio. The UI could similarly be described in terms of Shadow DOM, and the loading of audio data in terms of XHR or the upcoming fetch() API. It's not necessary to re-interpret everything all at once, however.

Web Audio acknowledges that the <audio> element performs valuable audio loading work today by allowing the creation of SourceNode instances from them:

/***********************************
 * 4.11 The MediaElementAudioSourceNode Interface
 **/
// Assumes an AudioContext and a downstream AudioNode already exist, e.g.:
//   var context = new AudioContext();
//   var filterNode = context.createBiquadFilter();
var mediaElement = document.getElementById('mediaElementID');
var sourceNode = context.createMediaElementSource(mediaElement);
sourceNode.connect(filterNode);

Lots of questions arise, particularly if we think of media element audio playback as though its low-level aspects were described in terms of Web Audio:

  • Can a media element be connected to multiple AudioContexts at the same time?
  • Does ctx.createMediaElementSource(n) disconnect the output from the default context?
  • If a second context calls ctx2.createMediaElementSource(n) on the same media element, is it disconnected from the first?
  • Assuming it's possible to connect a media element to two contexts, effectively "wiring up" the output from one bit of processing to the other, is it possible to wire up the output of one context to another?
  • Why are there both MediaStreamAudioSourceNode and MediaElementAudioSourceNode in the spec? What makes them different, particularly given that neither appears to have properties or methods and both do nothing but inherit from AudioNode?

All of this seems to indicate some confusion in, at a minimum, the types used in the design. For instance, we could answer a few of the questions (see the sketch after this list) if we:

  • Eliminate MediaElementAudioSourceNode and instead re-cast media elements as possessing MediaStream audioStream attributes which can be connected to AudioContexts
  • Remove createMediaElementSource() in favor of createMediaStreamSource()
  • Add constructors for all of these generated types; this would force explanation of how things are connected.
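As a rough sketch of the shape this re-cast suggests (none of the names below are in today's spec: the audioStream attribute and the explicit constructor are hypothetical, shown only to make the suggestion concrete):

var context = new AudioContext();
var mediaElement = document.getElementById('mediaElementID');

// Hypothetical: the media element exposes its audio as a MediaStream attribute...
var sourceNode = context.createMediaStreamSource(mediaElement.audioStream);
sourceNode.connect(context.destination);

// ...and an explicit constructor (hypothetical in this sketch) would make the wiring visible:
// var sourceNode = new MediaStreamAudioSourceNode(context, { mediaStream: mediaElement.audioStream });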

That leaves a few open issues for which we don't currently have suggestions but believe the WG should address (a sketch after this list shows the kind of code in question):

  • What AudioContext do media elements use by default?
  • Is that context available to script? Is there such a thing as a "default context"?
  • What does it mean to have multiple AudioContext instances for the same hardware device? Chris Wilson advises that they are simply sum'd, but how is that described?
  • By what mechanism is an AudioContext attached to hardware? If I have multiple contexts corresponding to independent bits of hardware...how does that even happen? AudioContext doesn't seem to support any parameters and there aren't any statics defined for "default" audio contexts corresponding to attached hardware (or methods for getting them).
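For concreteness, this is the kind of code those last questions are about. Both contexts below are constructed without any parameters, and nothing in the API surface says how their outputs relate to each other or to the hardware; the summing behaviour mentioned above is only what implementations reportedly do, not something the spec describes:

// Two independent contexts; neither constructor names an output device.
var ctxA = new AudioContext();
var ctxB = new AudioContext();

function beep(ctx, frequency) {
  // Illustrative helper: an oscillator routed straight to the context's destination.
  var osc = ctx.createOscillator();
  osc.frequency.value = frequency;
  osc.connect(ctx.destination);
  osc.start();
}

// Both contexts presumably reach the same speakers; implementations reportedly sum
// their outputs, but no parameter, static, or spec text describes that relationship.
beep(ctxA, 440);
beep(ctxB, 660);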
@domenic (Contributor) commented Mar 30, 2014

@jernoble would you be able to use your https://github.com/jernoble/Sound repo to answer @slightlyoff's original set of questions above? Especially the last set?

@cwilso (Contributor) commented Mar 31, 2014

Sort of. Jer's sound.js repo just calls decodeAudioData() for all files, so it's not going to stream; it will need a complete file before it can start. We'd need to implement codecs (or expand the decoding API in Web Audio) to make it complete.
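For context, the decodeAudioData() pattern in question looks roughly like this (a sketch using the promise form, not sound.js's actual code; the URL is illustrative). Because decodeAudioData() takes a complete ArrayBuffer, nothing can be scheduled until the whole file has been fetched and decoded:

var context = new AudioContext();

fetch('music.ogg')                                             // illustrative URL
  .then(function (response) { return response.arrayBuffer(); })
  .then(function (encoded) { return context.decodeAudioData(encoded); })
  .then(function (audioBuffer) {
    // Only now, with the entire file decoded in memory, can playback be scheduled.
    var source = context.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(context.destination);
    source.start();
  });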

cwilso self-assigned this Apr 17, 2014
@cwilso (Contributor) commented Apr 17, 2014

What we need to respond to this:

  • gap analysis with media elements
  • answers to specific questions above
  • model for "default audio context"

@joeberkovitz (Contributor) commented

Next step: write a response to the TAG issues, including this one.

@joeberkovitz (Contributor) commented

[...this is stuck in the same place as #250...] @cwilso has there been any discussion with the TAG? We need to address this in order to get the spec into V1 shape. Please let the group know if you are still on this, or if the chairs should find another way to resolve it.

cwilso changed the title from "Layering considerations" to "TAG Issue: Layering considerations" Oct 26, 2015
@joeberkovitz (Contributor) commented

TPAC: this is a big chunk that we should put at the top of the list for post-V1 action, taking a fresh look with the TAG in a cross-WG discussion.

@kirbysayshi commented Feb 10, 2017

One issue I've run into recently: it's impossible to accurately schedule playback (start/pause) of an HTMLMediaElement connected to an AudioContext. This means there is no way (as far as I know) to schedule accurate playback of either a large audio file (which requires streaming) or an EME-protected file.
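To illustrate the gap, here is a sketch (the decoded buffer and element id are assumed): an AudioBufferSourceNode can be started at an exact context time, while a media element routed through the graph can only be started with play(), which has no scheduling parameter.

var context = new AudioContext();

// Sample-accurate: a buffer source is scheduled against the context clock.
var bufferSource = context.createBufferSource();
bufferSource.buffer = decodedBuffer;                   // assumed to exist already
bufferSource.connect(context.destination);
bufferSource.start(context.currentTime + 1.0);         // starts exactly one second from now

// Not sample-accurate: the element starts "soon" after play(), with unspecified latency.
var mediaElement = document.getElementById('mediaElementID');  // hypothetical id
var elementSource = context.createMediaElementSource(mediaElement);
elementSource.connect(context.destination);
mediaElement.play();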

Is accurate scheduled playback also addressed by this issue? Or a separate concern?

@padenot (Member) commented Feb 13, 2017

It's separate. This issue is about speccing the HTMLMediaElement in terms of the Web Audio API and other specs, so that the Web Platform is, in a way, "layered": it offers lower-level primitives that allow the higher-level APIs to be reimplemented. Authors, depending on their needs, would target different sets of APIs (and ideally combine them).

For accurate scheduling of long media elements, I don't think there is a perfect, built-in way to do this at the moment. Authors have had good experiences using stitched AudioBufferSourceNodes, either decoding the media files in JavaScript or using a codec that lets them easily split and stitch using decodeAudioData (e.g., Vorbis, Opus), with minimal JS code. It all depends on the use case. For example, does "accurate" mean "sample accurate", or is 10 ms of scheduling jitter OK? In general, phrasing a problem in terms of real-world use cases allows for better framing of the issue. Maybe you could open a separate GitHub issue where we could discuss what does not currently work?
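A minimal sketch of that stitching approach, assuming the media has already been split into chunk files that decodeAudioData() can handle on their own (e.g. separate Opus or Vorbis files; the URLs are illustrative):

var context = new AudioContext();
var chunkUrls = ['part-000.opus', 'part-001.opus', 'part-002.opus']; // illustrative

// Decode every chunk, then schedule them back to back on the context clock.
Promise.all(chunkUrls.map(function (url) {
  return fetch(url)
    .then(function (response) { return response.arrayBuffer(); })
    .then(function (encoded) { return context.decodeAudioData(encoded); });
})).then(function (buffers) {
  var when = context.currentTime + 0.1;  // small lead time before the first chunk
  buffers.forEach(function (buffer) {
    var source = context.createBufferSource();
    source.buffer = buffer;
    source.connect(context.destination);
    source.start(when);                  // sample-accurate start
    when += buffer.duration;             // next chunk begins where this one ends
  });
});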

Being able to pipe EME-protected content into the Web Audio API using a MediaElementAudioSourceNode, or to use decodeAudioData on an EME-encrypted blob, is, for now, restricted for obvious reasons, but contacting browser vendors would be the right way forward here. For Mozilla, padenot@mozilla.com would work.

@kirbysayshi commented

OK, sorry for causing noise on this issue, and thank you for your thoughtful response! To answer your question: accurate means sample accurate (10 ms of scheduling jitter is perceptible in applications like beat-synced triggering of <audio> elements). Since HTMLMediaElement#play is the only interface for EME playback, the stitched-AudioBufferSourceNode solution unfortunately doesn't allow sample-accurate playback of EME audio (as far as I know, there is no way to get decoded buffers out of EME precisely).

I'll raise a separate issue with more details.

svgeesus added the TAG label Nov 6, 2017
@padenot (Member) commented Sep 17, 2019

It's been decided to do the decoding part of things in a different spec, because the Web Audio API really is about processing.

The Web Audio API can do everything that is needed on the playback side of the HTMLMediaElement.

padenot closed this as completed Sep 17, 2019
V2 Preparation (DO NOT USE) automation moved this from To do to Done Sep 17, 2019
plehegar added the w3c-tag-tracker label Apr 21, 2020