Difficult to address specific channel in ChannelMergerNode #304

Closed
tszynalski opened this Issue Mar 31, 2014 · 16 comments

Projects

None yet

7 participants

@tszynalski

Suppose you want to write a function that plays a mono sample on a given channel (left, right, center, etc.). The intuitive way of doing it would be to connect a BufferSourceNode with the mono sample to the appropriate input of a ChannelMergerNode. For example:

//connect to input #5 = right surround
bufferSource.connect(mergerNode, 0, 5);

Unfortunately, a ChannelMergerNode only takes into account connected inputs, so the above code would ignore the unconnected inputs 0-4, take the first connected input (#5), and output a mono signal.

The current solution is to do the following:

var silence = audioContext.createBufferSource();
//creating a dummy AudioBufferSourceNode which is treated as one channel of silence
silence.connect(mergerNode, 0, 0);
silence.connect(mergerNode, 0, 1);
silence.connect(mergerNode, 0, 2);
silence.connect(mergerNode, 0, 3);
silence.connect(mergerNode, 0, 4);
bufferSource.connect(mergerNode, 0, 5);

Note that when working with 5.1-channel audio, we have to connect 5 dummy silent nodes every time we want to play a sample on a single channel.

In addition, I believe there is nothing in the spec that prevents a browser from disconnecting and disposing of the dummy nodes as soon as the function exits and the silence variable is no longer referenced. This would cause the mergerNode to start outputting a mono signal, since it would receive only one input. So we would also have to make sure the "silence nodes" exist in a different (longer-lasting) scope than bufferSource.

It would be better to have the possibility of 1-1 mapping between ChannelMergerNode inputs and the channels in the output signal. We could tell a ChannelMergerNode what kind of output signal we want (mono, stereo, 5.1, etc.), connect a single input and the remaining inputs would be filled with silence.
It could work like this:

mergerNode.outputSignal = "5.1";
bufferSource.connect(mergerNode, 0, 5); //connect source to right surround channel

or

mergerNode.outputSignal = "5.1";
bufferSource.connect(mergerNode, 0, "SR");

The default setting of the "outputSignal" property could be "auto". This would correspond to the current way it works – automatically determine the number of channels by summing the channels on all connected inputs.
Setting outputSignal to "5.1" would have the following effects:

  • take the first 6 inputs (# 0-5)
  • interpret each input as mono (downmix if necessary)
  • interpret unconnected inputs as one channel of silence
  • mix input 0 into the left channel, input 1 into right, input 2 center, and so on.
  • ignore remaining inputs
@karlt
karlt commented Aug 21, 2014

On Mon, 31 Mar 2014 03:57:26 -0700, tszynalski wrote:

Suppose you want to write a function that plays a mono sample on a given
channel (left, right, center, etc.). The intuitive way of doing it would be to
connect a BufferSourceNode with the mono sample to the appropriate input of a
ChannelMergerNode. For example:

//connect to input #5 = right surround
bufferSource.connect(mergerNode, 0, 5);

Unfortunately, a ChannelMergerNode only takes into account connected
inputs
, so the above code would ignore the unconnected inputs 0-4, take the
first connected input (#5), and output a mono signal.

Yes, I find it unusual that inputs with no connections are treated
as having zero channels, where elsewhere the minimum number of
channels is usually one. "If the input has no connections then it
has one channel which is silent", though I assume "explicit"
channelCountMode is meant to override that requirement if
channelCount is greater than 1.

In addition, I believe there is nothing in the spec that prevents a browser
from disconnecting and disposing of the dummy nodes as soon as the function
exits and the silence variable is no longer referenced. This would cause the
mergerNode to start outputting a mono signal, since it would receive only one
input. So we would also have to make sure the "silence nodes" exist in a
different (longer-lasting) scope than bufferSource.

I believe that it would be a bug in implementations to make
garbage collection observable, but I know Gecko at least currently
has this bug.

It would be better to have the possibility of 1-1 mapping between
ChannelMergerNode inputs and the channels in the output signal. We could tell
a ChannelMergerNode what kind of output signal we want (mono, stereo, 5.1,
etc.), connect a single input and the remaining inputs would be filled with
silence.

Yes, a 1-1 mapping between inputs and output channels would be
clearer, and independent of dynamic connections and
disconnections.

If we can change the spec to make channelCountMode at least
default to "explicit" and channelCount default to 1, and include
disconnected inputs, then the number of output channels would
default to matching the numberOfInputs parameter to
createChannelMerger, and the default mapping would be 1-1.

See also http://lists.w3.org/Archives/Public/public-audio/2013JanMar/0141.html

@cwilso cwilso added this to the Web Audio Last Call 1 milestone Oct 29, 2014
@cwilso
Contributor
cwilso commented Oct 29, 2014

Not sure I know why merger works the way it does. I think we could define it as each input causes at least one channel (even if unconnected). That would not be totally backward compatible, though.

@cwilso cwilso changed the title from It's cumbersome to play a sound on a single channel to Difficult to address specific channel in ChannelMergerNode Oct 29, 2014
@cwilso
Contributor
cwilso commented Dec 4, 2014

WG agreed on 12/4/14 telecon that we should fix this: unconnected inputs should still cause extra channels, i.e. each input should cause at least one channel.

@cwilso cwilso self-assigned this Dec 4, 2014
@hoch
Member
hoch commented Dec 11, 2014

@padenot Is this reproducible in FireFox?

@hoch
Member
hoch commented Dec 12, 2014

Here is what I found in the merger node:

For every processing block, Merger always probes the connection status of input and copies channels from the active input into the output channel. If the connection is not active (disconnected, or the node reference is gone and GCed) then the merger will ignore the input and skip to the next available input.

Let's say you want to shoot 8 sine tone bursts for 8 speakers sequentially. Obviously you will want to create 8 oscillators and schedule them with start()/stop(). At this point, we have to consider that stop() will disconnect an oscillator with the merger. So when the first oscillator stops, its connection will be ignored by the merger and the second oscillator will be assigned to the first output channel. So forth, so on. As a result, you will hear all the sounds is coming from the first channel. No matter where you plug into the merger.

A current solution for this? You don't stop oscillators. Keep them running so they don't get ignored.

I don't find this behavior particularly useful, but I would love to hear what WG's opinion.

@hoch
Member
hoch commented Jan 12, 2015

Here is a proposal for ChannelMergerNode:

The following properties are fixed for ChannelMergerNode and InvalidState error should be thrown when user changes them:

ChannelMergerNode.channelCount = 1
ChannelMergerNode.channelCountMode = ‘explicit’

Changed behavior:

  • Each input will be summed to mono based on the specified mixing rule.
  • The summed mono channel from input will be transferred to the corresponding output channel (input_1 -> output_channel_1). If an input is not connected (inactive or disconnected), it still counts as one silent channel in output.
  • Note that the input to output channel mapping is fixed and won’t change the order of output channels dynamically.

Since this new behavior prevents ChannelMergerNode from dynamic channel reordering/resizing, any related issue can be resolved with this change. (such as #459)

@hoch
Member
hoch commented Jan 13, 2015

Note that the up/downmixing rule is only applied to each input when the channel layout is canonical (mono, stereo, quad, 5.0 and 5.1). For undefined channel layout, the mixing rule automatically changes to 'discrete' mode. That means the merger will just take the first channel and drop the rest.

@cwilso
Contributor
cwilso commented Jan 15, 2015

As a side note, this means that we could drop the dynamic channel
definition requirement for audio workers. Do we want to?

On Mon, Jan 12, 2015 at 4:09 PM, Hongchan Choi notifications@github.com
wrote:

Note that the up/downmixing rule is only applied to each input when the
channel layout is canonical (mono, stereo, quad, 5.0 and 5.1). For
undefined channel layout, the mixing rule automatically changes to
'discrete' mode. That means the merger will just take the first channel and
drop the rest.


Reply to this email directly or view it on GitHub
#304 (comment)
.

@hoch
Member
hoch commented Jan 15, 2015

@cwilso I second that. The dynamic (or so-called-intelligent) channel configuration is causing problems here and there. The behavior is unpredictable and difficult to maintain the state, etc. If the proposal above is accepted, the merger node will be robust and easy to control.

I would love to hear what other WG members think.

@hoch
Member
hoch commented Feb 5, 2015

@cwilso In the teleconference (2/5), @joeberkovitz agreed on this proposal but we would like to hear your opinion on this issue, especially in the regard of AudioWorker.

@padenot I was under the impression that you're not against the proposal. Please let me know if you have a different idea!

@padenot
Member
padenot commented Feb 6, 2015

I'm not against this proposal, I'm going to read it again and think
about it next week, though.

@joeberkovitz
Contributor

To clarify, my concern was that the proposed removal of dynamic channel configuration from AudioWorker might have potential side effects for AWs' ability to simulate other nodes for which this behavior is in fact expected and necessary, thus creating a problem for layering.

@cwilso
Contributor
cwilso commented Feb 6, 2015

It's fine. I'd already removed the ability for AWs to dynamically control
their number of inputs; the dynamic channel configuration is still possible
in AW (since it's necessary for other node replication, like
BufferSourceNode or Convolution). We should be good to go with Hongchan's
proposal.

On Fri, Feb 6, 2015 at 4:47 AM, Joe Berkovitz notifications@github.com
wrote:

To clarify, my concern was that the proposed removal of dynamic channel
configuration from AudioWorker might have potential side effects for AWs'
ability to simulate other nodes for which this behavior is in fact expected
and necessary, thus creating a problem for layering.


Reply to this email directly or view it on GitHub
#304 (comment)
.

@hoch
Member
hoch commented Feb 6, 2015

As @cwilso mentioned, the dynamic channel configuration between nodes (or downstream propagation) is not the problem we're trying to fix here. The problem is the unintuitive behavior of channel stacking in ChannelMergerNode.

@mdjp
Member
mdjp commented Mar 6, 2015

discussed on a WG teleconf 5th March 2015 - the group agrees with the proposal and the issue has been marked ready for editing.

@dstockwell dstockwell pushed a commit to dstockwell/blink that referenced this issue Mar 11, 2015
@hoch hoch Fixing input handling behavior of ChannelMergerNode
The current implementation of ChannelMergerNode has two critical issues:
1. Impossible to specify input-output mapping due to dynamic channel re-arranging.
  WebAudio/web-audio-api#304, crbug.com/441451
2. ChannelMergerNode connected with DelayNode in a cyclic audiograph causes a silent failure of audio rendering.  
  WebAudio/web-audio-api#459, crbug.com/442925

To address these issues, rtoy@, cwilso@ and hongchan@ proposed a new behavior of ChannelMergerNode:

1. The following properties are fixed for ChannelMergerNode and InvalidState error should be thrown when user changes them.
  - ChannelMergerNode.channelCount = 1
  - ChannelMergerNode.channelCountMode = ‘explicit’
2. Each input will be summed to mono based on the specified mixing rule.
3. The summed mono channel from input will be transferred to the corresponding output channel (input_1 -> output_channel_1). If an input is not connected (inactive or disconnected), it still counts as one silent channel in output.

Note that the up/downmixing rule is only applied to each input when the channel layout is canonical (mono, stereo, quad, 5.0 and 5.1). For undefined channel layout, the mixing rule automatically changes to 'discrete' mode. That means the merger will just take the first channel and drop the rest.

BUG=441451, 442925
TESTS=4 new tests
audiochannelmerger-cycle.html: to cover the bug 442925
audiochannelmerger-disconnect.html: to cover the bug 441451
audiochannelmerger-input.html: to cover the bug 441451
audiochannelmerger-input-non-default.html: to test the merger with non-default setting

Note that the following laytout test files were removed (or renamed) and merged into the new tests:
channelmerger-basic.html => audiochannelmerger-basic.html
channelmerger-input-handling.html => audiochannelmerger-input.html
channelmerger-cycle.html => audiochannelmerger-cycle.html
channelmerger-channel-limit.html => audiochannelmerger-basic.html
audiochannelmerger-stereo.html => audiochannelmerger-input.html

Review URL: https://codereview.chromium.org/808653010

git-svn-id: svn://svn.chromium.org/blink/trunk@191723 bbb929c8-8fbe-4397-9dbb-9b2b20218538
066e3d3
@cwilso
Contributor
cwilso commented Jun 2, 2015

Fixed.

@cwilso cwilso closed this Jun 2, 2015
@dstockwell dstockwell pushed a commit to dstockwell/chromium that referenced this issue Sep 23, 2015
@hoch hoch Fixing input handling behavior of ChannelMergerNode
The current implementation of ChannelMergerNode has two critical issues:
1. Impossible to specify input-output mapping due to dynamic channel re-arranging.
  WebAudio/web-audio-api#304, crbug.com/441451
2. ChannelMergerNode connected with DelayNode in a cyclic audiograph causes a silent failure of audio rendering.  
  WebAudio/web-audio-api#459, crbug.com/442925

To address these issues, rtoy@, cwilso@ and hongchan@ proposed a new behavior of ChannelMergerNode:

1. The following properties are fixed for ChannelMergerNode and InvalidState error should be thrown when user changes them.
  - ChannelMergerNode.channelCount = 1
  - ChannelMergerNode.channelCountMode = ‘explicit’
2. Each input will be summed to mono based on the specified mixing rule.
3. The summed mono channel from input will be transferred to the corresponding output channel (input_1 -> output_channel_1). If an input is not connected (inactive or disconnected), it still counts as one silent channel in output.

Note that the up/downmixing rule is only applied to each input when the channel layout is canonical (mono, stereo, quad, 5.0 and 5.1). For undefined channel layout, the mixing rule automatically changes to 'discrete' mode. That means the merger will just take the first channel and drop the rest.

BUG=441451, 442925
TESTS=4 new tests
audiochannelmerger-cycle.html: to cover the bug 442925
audiochannelmerger-disconnect.html: to cover the bug 441451
audiochannelmerger-input.html: to cover the bug 441451
audiochannelmerger-input-non-default.html: to test the merger with non-default setting

Note that the following laytout test files were removed (or renamed) and merged into the new tests:
channelmerger-basic.html => audiochannelmerger-basic.html
channelmerger-input-handling.html => audiochannelmerger-input.html
channelmerger-cycle.html => audiochannelmerger-cycle.html
channelmerger-channel-limit.html => audiochannelmerger-basic.html
audiochannelmerger-stereo.html => audiochannelmerger-input.html

Review URL: https://codereview.chromium.org/808653010

git-svn-id: svn://svn.chromium.org/blink/trunk@191723 bbb929c8-8fbe-4397-9dbb-9b2b20218538
cae9b2b
@hwine hwine pushed a commit to mozilla/gecko-dev that referenced this issue Nov 4, 2015
Karl Tomlinson bug 1094925 adopt spec changes for one channel per input ChannelMerge…
…r behavior r=padenot


WebAudio/web-audio-api#304

NotSupportedError is chosen for more sensible meaning and consistency with
other nodes.

--HG--
extra : rebase_source : a5b8b8af0aeb3751d299b5fe785afb9a99fe5dea
68c0558
@eamsen eamsen pushed a commit to eamsen/gecko-dev that referenced this issue Nov 10, 2015
Karl Tomlinson bug 1094925 adopt spec changes for one channel per input ChannelMerge…
…r behavior r=padenot


WebAudio/web-audio-api#304

NotSupportedError is chosen for more sensible meaning and consistency with
other nodes.
5060069
@jryans jryans pushed a commit to jryans/gecko-dev that referenced this issue Dec 14, 2015
Karl Tomlinson bug 1094925 adopt spec changes for one channel per input ChannelMerge…
…r behavior r=padenot


WebAudio/web-audio-api#304

NotSupportedError is chosen for more sensible meaning and consistency with
other nodes.
266ba7b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment