AudioFrequencyWorkerNode for working in the Frequency Domain #468

Open
hughrawlinson opened this Issue Jan 28, 2015 · 95 comments

Comments

@hughrawlinson

At the moment, ScriptProcessorNodes and AudioWorkers operate on time-domain buffer data. At the Web Audio Conference, it became clear there's demand for frequency-domain data inside a callback that gets called for every audio frame.

We're thus proposing a new node called AudioFrequencyWorkerNode, which gives the developer the option to obtain audio data, perform processing, and write output in the frequency domain. This involves passing an options object to the createAudioFrequencyWorker method, specifying the input and output types.

Defaults

The AudioFrequencyWorkerNode should allow access to time-domain and frequency-domain data concurrently. If no options object is passed to createAudioFrequencyWorker, both the input and the output type would default to the amplitude/phase pair. The options object would allow the user to choose between amplitude/phase, real/imaginary, and time-domain data. The dataOut type would default to the same as dataIn, but could be set to a different data type, in case the user wants, for example, to read real/imaginary pairs in and write out to the time domain.
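
For concreteness, a hypothetical sketch of those defaults (createAudioFrequencyWorker and its option names are the proposed API, not an existing one):

// No options object: dataIn and dataOut both default to the amplitude/phase pair.
var awDefault = createAudioFrequencyWorker("worker.js");

// Equivalent explicit form (hypothetical):
var awExplicit = createAudioFrequencyWorker("worker.js", {
    dataIn: ["amplitude", "phase"],
    dataOut: ["amplitude", "phase"]
});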

Proposed processing structure of the AudioFrequencyWorkerNode

INPUT (time-domain)
        |
    windowing
        |
       FFT
        |
~ ~ ~ ~ ~ ~ ~ ~ ~
     dataIn
        |
 onaudioprocess
        |
    dataOut
~ ~ ~ ~ ~ ~ ~ ~ ~
        |
     mirror
        |
  complete data
        |
      IFFT
        |
    windowing
        |
OUTPUT (time-domain)

Example code:

main JS

var aw = createAudioFrequencyWorker("worker.js", {
        dataIn: [              // any combination of:
            "amplitude",
            "phase",
            "real",
            "imaginary",
            "signal"
        ],
        dataOut: "complex",    // or "amplitude", "phase", "signal"
        bufferSize: 2048,      // must be a power of two (illustrative value)
        hopSize: 1024,         // samples between successive windows (illustrative value)
        windowingType: "hann", // illustrative value
        zeroPadding: 0         // samples of zero padding per frame
    });
// Signal (time domain) would be the default I/O for the AudioFrequencyWorker.
// Indicating another data type would wrap an FFT and/or IFFT around the user code.

AudioWorker JS

// callback to the AudioFrequencyWorkerNode
onaudioprocess = function (e) {
  // e.amplitude[bin][channel];
  // e.phase[bin][channel];
  // e.real[bin][channel];
  // e.imaginary[bin][channel];
  // e.signal[sample][channel]; // replacing the 'input', time domain

  // Edit the arrays in place, rather than copying into a separate output array.
  // Example: keep only amplitude bins above 0.5 (a crude spectral gate).
  for (var channel = 0; channel < e.amplitude[0].length; channel++) {
    for (var bin = 0; bin < e.amplitude.length; bin++) {
      e.amplitude[bin][channel] = e.amplitude[bin][channel] > 0.5 ? e.amplitude[bin][channel] : 0;
    }
  }
};
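
A brief, hypothetical usage sketch (assuming the factory lives on the AudioContext like the other node constructors): since the node's external input and output stay in the time domain, it connects like any other AudioNode.

var ctx = new AudioContext();
var source = ctx.createOscillator();
var aw = ctx.createAudioFrequencyWorker("worker.js", { dataIn: "amplitude", dataOut: "amplitude" });

source.connect(aw);          // time-domain audio in
aw.connect(ctx.destination); // time-domain audio out
source.start();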

Use Cases

  • Feature extraction
  • Phase vocoding
  • Pitch shifting
  • Pitch tracking
  • Time stretching
  • Frequency-domain gating effects

Jesse Allison, Hugh Rawlinson, Jakub Fiala, Nevo Segal
@jesseallison, @hughrawlinson, @jakubfiala, @nevosegal

Related

#248
#262

@jesseallison


👍

@jakubfiala


🎱

@nevosegal


👍

@jesseallison

jesseallison commented Jan 28, 2015

An FFT Bin Modulation Example:

A common audio process is changing the gain of individual bins. This could be done through an array of bin modulation values, processed as a whole or on a bin-by-bin basis.

main JS:

var bufferSize = 2048;
var binCount = bufferSize/2;

var binModNode = createAudioFrequencyWorker("binAmplitudeModulator.js", {
        dataIn: "amplitude",
        dataOut: "signal",
        bufferSize: bufferSize
    });

// binModNode.frequencyBinScalingArray[];  // Assuming parameters could be accessible directly upon instantiation... this may have to be declared.
var cutoffBin = 22;

// creating a cutoff frequency
for (var i = 0; i < binCount; i++) {
    var scale = (i < cutoffBin) ? 0 : 1;
    binModNode.frequencyBinScalingArray[i] = scale;
}

// adjusting a single bin
binModNode.frequencyBinScalingArray[5] = 2.5;

AudioWorker JS:

// callback to the AudioFrequencyWorkerNode
onaudioprocess = function (e) {
  var channelCount = e.amplitude[0].length;
  var bufferLength = e.amplitude.length; // This could possibly be automatically generated in the AudioFrequencyWorkerNode

  var frequencyBinScalingArray = e.parameters.frequencyBinScalingArray;

  for (var channel = 0; channel < channelCount; channel++) {
    for (var i = 0; i < bufferLength; i++) {
      e.amplitude[i][channel] = e.amplitude[i][channel] * frequencyBinScalingArray[i];
    }
  }
};
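
As an aside, binCount = bufferSize / 2 follows from the conjugate symmetry of a real signal's FFT, which is also what the "mirror" step in the diagram above would rebuild before the IFFT. A rough sketch (hypothetical helper, not part of the proposal's surface):

// Rebuild the upper half of the spectrum from the lower half:
// for a real time-domain signal, X[N - k] = conj(X[k]).
function mirrorSpectrum(real, imag) {
  var N = real.length; // full FFT size
  for (var k = 1; k < N / 2; k++) {
    real[N - k] = real[k];
    imag[N - k] = -imag[k];
  }
}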

@svgeesus

svgeesus commented Jan 28, 2015

Contributor

Excellent. I called for the folks who wanted this to write it up at today's Audio WG panel at WAC, and here it is the same day. Was cool to see it develop on the piratepad too.


@rtoy

rtoy commented Jan 28, 2015

Contributor

I fail to see what this node provides that an AudioWorker node does not.

I also find it confusing if the AudioFrequencyWorkerNode outputs a frequency-domain signal and is then connected to other nodes. Do you then just get random garbage in and out?

I think this also raises the question of what WebAudio is. Is it intended to be a general-purpose signal processing package where you can do whatever you want? (Modulo AudioWorkers, where you can do whatever you want, as long as the output is an audio signal. Yes, you can abuse that too, but that's not the intent.)



@hughrawlinson

hughrawlinson commented Jan 28, 2015

Hi @rtoy,

The AudioFrequencyWorkerNode provides a lot of functionality not available in the AudioWorkerNode. It allows programmers (and composers, game designers, and so on) to manipulate data in the frequency domain without all the prerequisite DSP knowledge necessary to accomplish similar tasks in the time domain.

In response to your second point, I should clarify the processing structure of the AudioFrequencyWorkerNode. The node acts like any other node in that it both accepts input and provides output in the time domain. If you look at the little ASCII diagram in the original issue, 'INPUT' and 'OUTPUT' refer to the input and output of the node, while 'dataIn' and 'dataOut' are the frequency-domain arrays that are accessible inside the callback of the AudioFrequencyWorkerNode. The node would take care of the transformation between time and frequency both on the way into the callback and on the way out of it.

I'm not quite sure what you're getting at with your third point; would you mind clarifying?


@hoch

hoch commented Jan 28, 2015

Member

all the prerequisite DSP knowledge

I guess the prerequisite here is being able to write an FFT/IFFT in an AudioWorker. Besides, you still need DSP knowledge to manipulate magnitude/phase properly - doing the FFT/IFFT in native code doesn't necessarily reduce the amount of knowledge you need.

Here are more nitpicks:

  1. Are you proposing this because you assume FFT/IFFT with JS in an AudioWorker will be too slow for realtime applications?
  2. Browser vendors probably will not have identical FFT/IFFT implementations, so what you will have in the node after the FFT might vary across browsers or platforms (different code, etc.). Are you okay with that? Doing the FFT/IFFT with optimized JS code in an AudioWorker would not have that kind of problem.
@adelespinasse

adelespinasse commented Jan 28, 2015

How about, instead of this, just provide a good set of general-purpose DSP operations for Float32Array, including FFT, inverse FFT, windowing, etc.? That would let you do everything this proposal lets you do (unless I'm missing something), and also lots of other stuff.


@hughrawlinson

hughrawlinson commented Jan 28, 2015

There are other reasons that the AudioWorkerNode isn't suited to dealing with frequency-domain data. The buffer size is set at 128 samples, which, when converted to the frequency domain, gives you bins that are ~345 Hz wide (if my maths are right). Not exactly ideal. The fixed size fulfils the design goal that the AudioWorkerNode doesn't introduce any latency into the audio graph, but for many purposes in the frequency domain it isn't ideal.

Probably browser vendors will not have the identical FFT/IFFT implementation

That sounds like something that should be specified... I'm very surprised to see that it seems kind of ambiguous in the spec for the AnalyserNode...

A good set of general-purpose DSP operations is being proposed in the Web Array Math API, whose status I don't know or understand.
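
For the record, the bin-width arithmetic at a 44.1 kHz sample rate:

var sampleRate = 44100;
var fftSize = 128;
var binWidth = sampleRate / fftSize; // ≈ 344.5 Hz per bin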


@rtoy

rtoy commented Jan 28, 2015

Contributor

On Wed, Jan 28, 2015 at 3:23 PM, Hugh Rawlinson notifications@github.com wrote:

There are other reasons that the AudioWorkerNode isn't suited to dealing with frequency domain data. The buffer size is set at 128 samples, which when converted to the frequency domain gives you bins that are ~345 Hz wide (if my maths are right). Not exactly ideal. This is to fulfil the design goal that the AudioWorkerNode doesn't introduce any latency into the audio graph. For many purposes in the frequency domain, this isn't ideal.

You have no choice on the buffer size. Nodes always get blocks of 128 frames. You have to buffer internally if you want to process in larger chunks. (This buffering is kind of hidden for ScriptProcessorNodes, where the buffering is done for you and the node gets larger buffers all at once.)

Probably browser vendors will not have the identical FFT/IFFT implementation

That sounds like something that should be specified... I'm very surprised to see that it seems kind of ambiguous in the spec for the AnalyserNode http://webaudio.github.io/web-audio-api/#the-analysernode-interface ...

Oops. That's a bug and we should specify precisely what the FFT is. (I know of at least two ways of defining the forward part, and at least three ways to scale it.) I'm pretty sure, however, that currently everyone does it the same way, and any differences are rounding errors depending on the exact FFT algorithm used.

A good set of general-purpose DSP operations is being proposed, in the Web Array Math API http://opera-mage.github.io/webarraymath/ whose status I don't know or understand.
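
A minimal sketch of the internal buffering rtoy describes (hypothetical worker-side code; processFrame stands for whatever window/FFT/user-code/IFFT work you want per frame):

var FFT_SIZE = 2048;
var accumulator = new Float32Array(FFT_SIZE);
var writeIndex = 0;

// Called with each 128-frame render quantum.
function handleBlock(block) {
  accumulator.set(block, writeIndex);
  writeIndex += block.length;
  if (writeIndex === FFT_SIZE) {
    processFrame(accumulator); // window + FFT + user code + IFFT
    writeIndex = 0;            // a real implementation would overlap frames
  }
}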


@rtoy

rtoy commented Jan 28, 2015

Contributor

On Wed, Jan 28, 2015 at 2:34 PM, Hugh Rawlinson notifications@github.com wrote:

The AudioFrequencyWorkerNode provides a lot of functionality not available in the AudioWorkerNode. It allows programmers (and composers, game designers, etc.) to manipulate data in the frequency domain without all the prerequisite DSP knowledge necessary to accomplish similar tasks in the time domain.

I think that if you're manipulating things in the frequency domain, you have a fair amount of DSP knowledge already. It's easy enough for an AudioWorkerNode to do an FFT internally using any of the available JS FFT libraries out there.

In response to your second point, I should clarify the processing structure of the AudioFrequencyWorkerNode. The node acts like any other node in that it both accepts input and provides output in the time domain. If you look at the little ASCII diagram in the original issue, 'INPUT' and 'OUTPUT' refer to the input and output of the node, while 'dataIn' and 'dataOut' are the frequency-domain arrays that are accessible inside the callback of the AudioFrequencyWorkerNode. The node would take care of the transformation between time and frequency both on the way into the callback and on the way out of it.

Ah, thanks for the clarification. Audio in and out makes much more sense.

I'm not quite sure what you're getting at with your third point, would you mind clarifying?

Basically, what kind of native nodes should WebAudio supply? Is the intent to supply a huge set of nodes where you can do just about any general-purpose DSP technique, as if you were using, say, a Matlab-like clone running in a browser? I don't think that's the goal.


@rtoy

rtoy commented Jan 29, 2015

Contributor

Aren't there JS libraries out there already to do these kinds of things? I don't see that this falls under WebAudio's goals.

On Wed, Jan 28, 2015 at 3:04 PM, Alan deLespinasse notifications@github.com wrote:

How about, instead of this, just provide a good set of general-purpose DSP operations for Float32Array, including FFT, inverse FFT, windowing, etc.? That would let you do everything this proposal lets you do (unless I'm missing something), and also lots of other stuff.


@sebpiq

sebpiq commented Jan 29, 2015

I think that if you're manipulating things in the frequency domain, you have a fair amount of DSP knowledge already

With that reasoning, most of the Web Audio API doesn't make sense, does it ;)

Even with fair knowledge of DSP, there is some not-completely-trivial stuff there, such as overlapping windows and so on.
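
For a sense of that non-trivial part, a sketch (under the usual STFT assumptions) of Hann windowing with overlap-add, which the proposed node would otherwise handle internally:

function hann(N) {
  var w = new Float32Array(N);
  for (var n = 0; n < N; n++) {
    w[n] = 0.5 * (1 - Math.cos(2 * Math.PI * n / (N - 1)));
  }
  return w;
}

// Window each processed frame and sum it into the output at its hop offset.
function overlapAdd(frames, frameSize, hopSize) {
  var out = new Float32Array((frames.length - 1) * hopSize + frameSize);
  var w = hann(frameSize);
  for (var i = 0; i < frames.length; i++) {
    for (var n = 0; n < frameSize; n++) {
      out[i * hopSize + n] += frames[i][n] * w[n];
    }
  }
  return out;
}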


@hoch

hoch commented Jan 29, 2015

Member

How about, instead of this, just provide a good set of general-purpose DSP operations for Float32Array, including FFT, inverse FFT, windowing, etc.?

A good set of general-purpose DSP operations is being proposed, in the Web Array Math API whose status I don't know or understand.

@adelespinasse @hughrawlinson That seems to be a nice complement for AudioWorker.


@sebpiq

sebpiq commented Jan 29, 2015

Yes... the Web Array Math API sounded great but I haven't heard any news for a good 6 months now :(


@hughrawlinson

hughrawlinson commented Jan 29, 2015

I agree with @sebpiq: the ability to do something in AudioWorker doesn't negate the need for a node. You could implement every single node in the spec with AudioWorker; if that's the reasoning, then why not just make people implement their own oscillators with AudioWorker rather than supply OscillatorNode?

Is the intent to supply a huge set of nodes where you can do just about any general purpose DSP technique as if you were using, say, a Matlab-like clone running in a browser? I don't think that's the goal.

I don't think you need a huge set of nodes to do any general-purpose DSP technique. I do, however, think it's good to be able to work in both the time and frequency domains; AudioWorker caters for the time domain, so why not have something that caters for the frequency domain? If you want to optimise for as few nodes as possible, then surely we should all be implementing our own oscillators in AudioWorkers rather than using OscillatorNodes. I don't really think having fewer nodes is a great design goal; having general-purpose nodes is, and I think AudioFrequencyWorker is general enough to warrant being a node.

@adelespinasse @hughrawlinson That seems to be a nice complement for AudioWorker.

Yeah, the Web Array Math API functions would definitely be useful inside of AudioWorker, but that spec doesn't seem to be progressing, and as it's a separate spec it may not be ready for years, even after the Web Audio API is released.


@jakubfiala

jakubfiala commented Jan 29, 2015

@rtoy So the gap between the knowledge necessary to, say, do basic spectral modulation in your AudioWorker, and that needed to implement a highly optimized Fast Fourier Transform, is quite vast. The main motivation behind this proposal is really that the majority of Web Audio developers aren't able to do the latter easily, me being one of them. Given that it's one of the fundamentals of audio DSP and pretty much everybody in the field gets to use the FFT spectrum at some point, I think it's more than fitting to have a really damn good, as well as really damn fast, version of it in WA.

Not to mention that we already do have an FFT in WA, and this is just an attempt to make it useful for more than simply visualizing the spectrum, which is pretty much all the AnalyserNode is good for. Actually, in this sense I can take your argument even further and say: if there already is a native FFT in Web Audio, why should we have people implementing it again in JS? Isn't that just horribly inefficient?

Mind you, we originally envisaged this as an extension to the normal AudioWorker (so no extra nodes), but after a consultation with Paul Adenot we realised it would only work as a separate node, mainly because of the fixed buffer size constraint.


@cwilso

cwilso commented Jan 29, 2015

Contributor

If you're looking for an FFT that produces sound - a frequency-domain transformer - then yes, obviously the Analyser is not for you.

My personal take on this is that this would likely be better served by a separate FFT library, and someone dealing with the array aggregation in interesting ways. (We could provide a "buffering node" that enabled processing of large blocks with corresponding latency - but I'm not personally convinced quite enough that it's necessary, i.e. hard or nonperformant enough to do yourself.) At any rate, this is DEFINITELY separate from the straight AudioWorker.

That said - I'll defer to the WG, but I don't think this is a v1 feature. You can, for the moment, build the buffering semantic yourself in a worker, and the FFT in JS (or use Web Array Math if there's an implementation).


@chrislo

chrislo commented Jan 29, 2015

Member

I think @cwilso's point is a very valid one. The existence of AudioWorker, and presumably the proliferation of libraries that will then be developed to handle the buffering/aggregation and (I)FFT maths, will make it very clear to us over time what a future AudioFrequencyWorker's interface should look like.

Exposing all of the buffering parameters, as well as those of the FFT algorithm (specifying overlap, window functions, scaling, etc.), is going to be a hard thing to get right - I imagine a consensus on this might develop over time in user code that can perhaps be incorporated into the API later for performance / ease-of-use reasons.


@chrislo

chrislo commented Jan 29, 2015

Member

(That being said, I think it's fantastic that this issue has been raised as a direct consequence of the public Audio WG consultation meeting on Wednesday)


@notthetup

notthetup commented Jan 29, 2015

Contributor

Does anyone have any performance numbers for optimised JS implementations of FFT running inside a ScriptProcessorNode, along with FFT implementations optimised with asm.js + SIMD.js (and maybe also PNaCl)?

I feel knowing the performance difference between native and JS implementations would be very important to understand what we're actually gaining from making this functionality into a native Node.

Of course, strictly speaking we should be looking at the performance in the AudioWorker (closer to native Nodes in terms of threading), but that will have to wait, so we could start with ScriptProcessorNode for now.


@cristiano-belloni

cristiano-belloni commented Jan 29, 2015

@notthetup - A good starting point could be this library: https://github.com/corbanbrook/dsp.js/.
Although it's not asm.js-optimized, the RFFT class in the lib performs a forward FFT.

A good native FFT implementation is http://www.fftw.org/, for comparison.
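
For anyone trying dsp.js, a short sketch assuming its documented RFFT interface (check the repo's README for the exact API):

var bufferSize = 2048;
var timeDomainBuffer = new Float32Array(bufferSize); // filled with input samples
var rfft = new RFFT(bufferSize, 44100);              // (size, sampleRate)
rfft.forward(timeDomainBuffer);                      // compute the forward FFT
var spectrum = rfft.spectrum;                        // magnitude per bin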


@cristiano-belloni

cristiano-belloni commented Jan 29, 2015

@rtoy:

I think that if you're manipulating things in the frequency domain, you have a fair amount of DSP knowledge already. It's easy enough for an AudioWorkerNode to do an FFT internally using any of the available JS FFT libraries out there.

Yep, but isn't that re-inventing the [possibly square] wheel? The FFT is a building block of DSP; it makes sense to have it implemented efficiently and natively, without everyone having to re-implement it or use libraries that re-implement it.

Compare to oscillators: it's reasonably easy to implement an oscillator in any given programming language, but the Web Audio API implements them natively and efficiently, with a single interface, because they're elementary building blocks of audio processing. Why shouldn't the FFT be treated equally?


@notthetup

notthetup commented Jan 29, 2015

Contributor

@janesconference Thanks. I also found this, which seems to use asm.js: https://github.com/g200kg/Fft-asm.js/commits/master

I will try to create an example to test performance when I'm back home.


@rtoy

rtoy commented Jan 29, 2015

Contributor

On Wed, Jan 28, 2015 at 4:25 PM, Hugh Rawlinson notifications@github.com wrote:

I agree with @sebpiq, the ability to do something in AudioWorker doesn't negate the need for a node. You could implement every single node in the spec with AudioWorker, if that's the reasoning then why not just make people implement their own Oscillators with AudioWorker rather than supply OscillatorNode?

Perhaps if history had proceeded differently and an AudioWorker existed from the beginning, all nodes would have been defined or implemented in terms of an AudioWorker. That's not how history went. I'm not sure, but it's conceivable that implementations could change existing nodes to be AudioWorkers underneath if so desired.

I don't think you need a huge set of nodes to do any general purpose DSP technique. I do however think it's good to be able to work in both the time and frequency domains, and AudioWorker caters for the time domain, so why not have something that caters for the frequency domain? If you want to optimise for as few nodes as possible, then surely we should all be implementing our own oscillators in AudioWorkers rather than using OscillatorNodes. I don't really think having fewer nodes is a great design goal; having general-purpose nodes is, and I think AudioFrequencyWorker is general enough to warrant being a node.

I grant that it is useful. But not beyond what an AudioWorker could do just as well.

And given that even with the few nodes we have today, we have still failed after several years to specify them completely enough that someone could implement them from the spec, fewer nodes are good. :-)


@cristiano-belloni

cristiano-belloni commented Jan 30, 2015

Perhaps if history had proceeded differently and an AudioWorker existed from the beginning, all nodes would have been defined or implemented in terms of an AudioWorker. That's not how history went. I'm not sure, but it's conceivable that implementations could change existing nodes to be AudioWorkers underneath if so desired.

That's interesting. I was under the impression that nodes were specialized, fast and memory-friendly ways to implement building blocks. Wouldn't implementing them via a worker be penalizing (e.g. for easily discardable AudioBufferSourceNodes)?

@notthetup

notthetup commented Feb 24, 2015

Contributor

Sorry this took a while, but I have a basic performance test set up that measures how long it takes to run FFTs using some of the popular FFT libraries, plotted against ScriptProcessor callback times. Of course we would want to do this with the AudioWorker when it comes, but this gives us data points to start with. Also, please let me know if you see any bugs.

http://chinpen.net/webaudiofftperf/
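
For reference, the core of such a measurement might look like this (hypothetical harness; fft stands for whichever library object is under test):

var ctx = new AudioContext();
var node = ctx.createScriptProcessor(2048, 1, 1);
node.onaudioprocess = function (e) {
  var input = e.inputBuffer.getChannelData(0);
  var t0 = performance.now();
  fft.forward(input); // the library call under test
  var elapsed = performance.now() - t0;
  var budget = 2048 / ctx.sampleRate * 1000; // callback budget in ms
  console.log(elapsed.toFixed(3) + " ms of " + budget.toFixed(1) + " ms budget");
};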


@jakubfiala

jakubfiala commented Feb 24, 2015

@notthetup awesome! As far as I understand, the fft-asm.js option isn't working yet, right?


@notthetup

notthetup commented Feb 24, 2015

Contributor

@jakubfiala Yes. I haven't implemented that yet. Hopefully this weekend.


@notthetup

notthetup commented Apr 30, 2015

Contributor

@echo66 Thanks!! Will look at those

If anyone has some time (I'm really flooded for the next few weeks), please feel free to PR https://github.com/notthetup/webaudiofftperf


@padenot padenot referenced this issue May 11, 2015

Closed

where ??? #531

@mdjp mdjp modified the milestone: Uncommitted May 14, 2015

@echo66

echo66 commented May 20, 2015

Greetings to everyone!

In the last month, I have been working on real-time time stretching and pitch shifting using a phase vocoder, using my own implementations and @janesconference's PitchShifter. Currently, there seems to be a big bottleneck in FFT calculation, resulting in audio dropouts if you have more than 4-5 nodes doing time stretching and/or pitch shifting. For me (at least), the absence of an alternative to the JavaScript FFT implementations is a major roadblock.

Note 1: I forgot about the pitch shifter implemented by @cwilso. That one doesn't seem to suffer from what I mentioned in the previous paragraph. Unfortunately, for me, it provides lower (audio) quality than @janesconference's implementation or my phase vocoder + resampling.
Note 2: It should be noted that the implementations I have created/experimented with do not use SIMD.


@cwilso

cwilso commented May 20, 2015

Contributor

The pitch shifting I did is granular resynthesis, which is going to give lower audio quality. I expect a lot of the dropouts you're experiencing are because you're using ScriptProcessors for the nodes (yes?), and the thread-hopping starts thrashing.


@echo66

echo66 commented May 20, 2015

Yes, I'm using ScriptProcessors, @cwilso.


@cristiano-belloni

cristiano-belloni commented May 21, 2015

I'll point out that you have a Convolver because room effects like reverb are super-common; the super-sophistication of the Panner I personally think was a mistake, and that's why I've pushed for StereoPanner in the spec, but everybody also wants to pan; there's a (relatively simplistic) Analyser because freakin' EVERYBODY wants music/audio visualization, and it's (compared to doing your own FFT when you're not a DSP expert) very easy to use. And dynamics compression is in there because it's very easy to mess up and overdrive your output signal, and digital clipping sucks.

And I personally love all of them, even the 3D panner that always shifts my output samples in wav.hya.io. Compared to the old times of the now-defunct Firefox Audio Data API (I think that's what it was called), WAA is super easy to use and provides native, efficient ways to use DSP building blocks.

The first version of my phase-vocoding pitch shifter was for Audio Data, in Firefox 4.0 I think, and it was a nightmare to get right. I guess the bottom line of all this (AD dead, WAA alive and well) is that sometimes you have to go native, and I wouldn't go back to pulling samples and hitting 100% CPU after chaining 3 real-time effects.

That said, I guess that, just as EVERYBODY wants efficient and cool 3D panning, impulse convolving and easy filtering, there's a good percentage of all those everybodies who would like to mess with the spectrum and do vocoding, pitch and time shifting, frequency-domain filtering, audio fingerprinting and loads of other cool stuff.

@cwilso

cwilso May 21, 2015

Contributor

I still think we need an audio-data-level API (but much better designed, e.g. NOT in the main thread); and I also think we need hooks to enable low-level innovation like spectral frequency, pitch/time shifting, arbitrary FFT filtering, etc - but I will state that I think the number of users of those features will be far fewer (as they require much more expertise) than, say, simple panning and reverb. Not less important; just fewer.

@cristiano-belloni

cristiano-belloni May 21, 2015

Probably they will be fewer, but what about the number of higher-level users indirectly using those features? Your users are developers, but they have end users and intermediate users in turn. Moving audio to the web could shift its potential user base away from native apps (or at least that's what everyone in this thread hopes).

@echo66

echo66 May 22, 2015

@chrislo , one question: I know this might be a little off-topic (or not, let's see), but instead of just thinking about a DSP standard for audio, why not make it general for other devices that the browser might detect? Nowadays, the number of sensors and input devices in a smartphone is bigger than any "average joe" might ever imagine. They all provide data feeds/signals to be processed in one way or another, and each device provides signals with a specific number of dimensions: the webcam is 2D, web audio is 1D (per channel), the gyroscope is 3D, the compass is 8D. So maybe the W3C should start sketching a way to offer DSP operations on N-dimensional data.

This is just a suggestion, and perhaps off-topic. It is probably not even an original suggestion.

@notthetup

notthetup May 22, 2015

Contributor

I'm personally wary of pushing more functionality down into the WAA spec.

I'll take ConvolverNode as an example. I think the multi-threaded reverb convolver implementation is really great. It works really well in that specific use case, but having it baked into the WAA with limited (only AudioParam) controls makes it unusable at times.

For example, while trying to extend http://chinpen.net/auralizr/, I needed to change the impulse responses of the ConvolverNode in pseudo-realtime. The current implementations dump the convolution buffer every time the impulse response property is changed.

Now I know my use case is pretty uncommon, but since the implementation is in the spec, it can't be changed or tweaked. If the whole thing were implemented in JS (and yes, I understand the issue with performance) I could have easily changed the setter for the impulse response property; one userland workaround is sketched below.
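A workaround for that specific case, sketched with my own naming (not code from auralizr): keep two ConvolverNodes and crossfade between them when the impulse response changes, so the old tail fades out instead of being dumped.

// Sketch: swap impulse responses by crossfading two convolvers. Illustrative only.
function makeSwappableConvolver(ctx, fadeTime) {
  var input = ctx.createGain();
  var output = ctx.createGain();
  function branch() {
    var conv = ctx.createConvolver();
    var gain = ctx.createGain();
    input.connect(conv);
    conv.connect(gain);
    gain.connect(output);
    return { conv: conv, gain: gain };
  }
  var active = branch();
  var idle = branch();
  idle.gain.gain.value = 0;                 // idle branch starts silent
  return {
    input: input,
    output: output,
    setImpulseResponse: function (irBuffer) {
      idle.conv.buffer = irBuffer;          // load the new IR into the silent branch
      var t = ctx.currentTime;
      active.gain.gain.setValueAtTime(1, t);
      active.gain.gain.linearRampToValueAtTime(0, t + fadeTime);
      idle.gain.gain.setValueAtTime(0, t);
      idle.gain.gain.linearRampToValueAtTime(1, t + fadeTime);
      var tmp = active; active = idle; idle = tmp;
    }
  };
}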

I feel there is much more to gain in working with 'userland' JS implementations of DSP functionality and getting them to perform better (asm.js, SIMD.js, etc.), for the ability to tweak them, change them, and update them without needing to involve the browser vendors. Faster turnaround, and more control.

I understand we're not there yet in terms of JS performance, but pushing this into the spec will take time, and by the time it gets implemented, it might not have much of a performance advantage over JS implementations.

Finally, I really think this discussion needs real-world numbers. I started a tiny project which tries (very unscientifically) to look at the performance of JS FFT with various libraries. Please feel free to fork/improve it. It would be great to see how much performance a browser implementation would add to this.

http://chinpen.net/webaudiofftperf/

@echo66

echo66 May 22, 2015

@artofmus ,

var size = 2048;
var wantedSize = 1025; // I just want the first half of the spectrum (size/2 + 1 bins).
var stdlib = {
    Math: Math,
    Float32Array: Float32Array,
    Float64Array: Float64Array
};
var heap = fourier.custom.alloc(size, 3);
// For each custom FFT, you may choose if you want Float32 or Float64. Additionally, you must say if you want to use the asm or the raw version.
var fft = fourier.custom["fft_f32_" + size + "_asm"](stdlib, null, heap);
fft.init();

var timeframe = new Float32Array(size); // the time-domain frame to transform

var real = new Float32Array(wantedSize);
var imag = new Float32Array(wantedSize);

// Forward FFT: the real input goes at heap offset 0, a zeroed imaginary part at offset `size`.
fourier.custom.array2heap(timeframe, new Float32Array(heap), size, 0);
fourier.custom.array2heap(new Float32Array(size), new Float32Array(heap), size, size);
fft.transform();
fourier.custom.heap2array(new Float32Array(heap), real, wantedSize, 0);
fourier.custom.heap2array(new Float32Array(heap), imag, wantedSize, size);

// Inverse FFT. The library only provides a forward transform, so rebuild the
// full spectrum by conjugate symmetry and use ifft(X) = conj(fft(conj(X))) / N;
// the final conjugation can be skipped because the output signal is real.
var fullReal = new Float32Array(size);
var fullImag = new Float32Array(size);
fullReal.set(real);
for (var k = 0; k < wantedSize; k++) fullImag[k] = -imag[k]; // conjugate the lower half
for (var k = 1; k < size / 2; k++) {
  fullReal[size - k] = real[k];  // mirror: X[N-k] = conj(X[k]), conjugated again for the trick
  fullImag[size - k] = imag[k];
}
fourier.custom.array2heap(fullReal, new Float32Array(heap), size, 0);
fourier.custom.array2heap(fullImag, new Float32Array(heap), size, size);
fft.transform();
fourier.custom.heap2array(new Float32Array(heap), timeframe, size, 0);

// Do not forget to normalize the IFFT output
for (var i = 0; i < size; i++) {
  timeframe[i] /= size;
}

@cristiano-belloni

cristiano-belloni May 22, 2015

> I understand we're not there yet in terms of JS performance
> by the time it gets implemented, it might not have that much improved performance compared to JS implementations

I wish, but my impression (and my fear) is that every year we're on the verge of being almost there, but we never get there. Like a "this is the year of Linux on desktop" situation.

@echo66

echo66 May 22, 2015

Well, let's just put it this way: currently, I can't apply high-quality time stretching + pitch shifting using JS implementations of the FFT without getting A LOT of audio dropouts with more than two or three tracks playing in stereo (and I'm using a blank window!). Want a Traktor Pro in JavaScript? Well, too bad.

@notthetup

notthetup May 22, 2015

Contributor

@echo66 Added some info in the ReadMe. I'm travelling right now as well, so I don't have much time to update the UI. PM me if you need specific info.

Based on the perf test, for a window of 1k the FFT takes ~0.5 msec, but the callback only comes every ~22 msec (1024 samples at 44.1 kHz is about 23 ms), which seems like enough time for doing multiple channels of FFT. Unless there is something wrong in the perf test (highly likely knowing me..) or your phase-vocoder implementation has other components which take a lot of time too.
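A rough harness for reproducing that kind of measurement (a sketch: fft here stands in for whichever library is under test, e.g. one from the page linked above; it is not a built-in):

// Time repeated 1024-point transforms and report the average per call.
var N = 1024;
var real = new Float32Array(N);
var imag = new Float32Array(N);
for (var i = 0; i < N; i++) real[i] = Math.random() * 2 - 1;

var runs = 1000;
var t0 = performance.now();
for (var r = 0; r < runs; r++) fft(real, imag); // assumed: in-place N-point FFT
var perCall = (performance.now() - t0) / runs;

// At 44.1 kHz a 1024-sample block arrives every 1024/44100 ≈ 23 ms, so perCall
// must stay well under that budget, multiplied by channels and nodes.
console.log(perCall.toFixed(3) + ' ms per ' + N + '-point FFT');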

@artofmus

artofmus May 23, 2015

Can someone say whether AudioWorkerNode is currently available only in the form of a JS library (http://mohayonao.github.io/audio-worker-node/), or is it already available in some alpha versions of any browsers? If not, when is it planned to be implemented?

@cwilso

cwilso May 23, 2015

Contributor

@echo66 if you're making that statement based on a JS implementation in ScriptProcessor, you're comparing apples and oranges - ScriptProcessor is both very costly and very poorly predictable due to its cross-thread nature. AudioWorker is intended to remove that problem, and the remaining stress will be from pure JS perf, which is both 1) rarely as bad as people think it is, and 2) improving daily.

@artofmus to my knowledge, no one has begun a browser implementation of AudioWorker yet.
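For context, the pattern under discussion looks like this; the onaudioprocess callback runs on the main JS thread, so every block has to hop from the audio thread to the main thread and back, which is the thrashing described above:

var ctx = new AudioContext();
// bufferSize 1024, one input channel, one output channel
var node = ctx.createScriptProcessor(1024, 1, 1);
node.onaudioprocess = function (e) {
  var input = e.inputBuffer.getChannelData(0);
  var output = e.outputBuffer.getChannelData(0);
  // Runs on the main thread, competing with layout, GC, UI work...
  for (var i = 0; i < input.length; i++) {
    output[i] = input[i]; // passthrough; FFT work would go here
  }
};
// someSource.connect(node); node.connect(ctx.destination);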

@echo66

echo66 May 23, 2015

@cwilso , I understand that there is a big bottleneck with ScriptProcessor, but you should not forget one thing: even in a DAW like Ableton Live, there is a point when you need to "freeze" some tracks in order to avoid distortion and audio dropouts due to heavy CPU load. And this happens even with C/C++ FFT implementations for effects. Of course, you only face this issue if you create many tracks. But in the browser I won't be surprised (actually, I'm betting on it) to see more audio dropouts and distortion than in a C/C++ app. And spectral analysis tools have a big performance footprint: the FFT is O(N log N) per frame, and even the sliding FFT is O(N) per sample (see the sketch below). So, it is a sensitive issue.
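To make that complexity claim concrete, here is a one-bin sliding DFT sketch (my own illustration, not from any library): each incoming sample updates a bin in O(1), so tracking all N bins costs O(N) per sample, versus O(N log N) for a full FFT per hop.

// Sliding DFT for a single bin k of an N-point window.
// Recurrence: S_k(n) = e^(j*2*pi*k/N) * (S_k(n-1) + x(n) - x(n-N))
function slidingDFTBin(N, k) {
  var wRe = Math.cos(2 * Math.PI * k / N);
  var wIm = Math.sin(2 * Math.PI * k / N);
  var re = 0, im = 0;
  var ring = new Float32Array(N);   // last N samples
  var pos = 0;
  return function update(x) {
    var d = x - ring[pos];          // new sample in, oldest sample out
    ring[pos] = x;
    pos = (pos + 1) % N;
    var t = re + d;                 // the delta is purely real
    re = t * wRe - im * wIm;        // rotate by one bin's phase increment
    im = t * wIm + im * wRe;
    return { re: re, im: im };      // current value of bin k
  };
}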

I understand your stance regarding adding a spectral analysis node. But if you want to make the web browser very attractive to the big players in music production, the issue with analysis performance must be tackled sooner or later.

I think we have all stated our stances in this issue. No need to drag this out much more.

@artofmus , I advise you to take a look at that code. That polyfill/shim provides just the AudioWorker API; it does not actually run the AudioWorker on a different thread.

@cwilso

cwilso May 29, 2015

Contributor

To be clear, I don't have a "stance", per se, on a spectral analysis node (aka a native generic FFT node) - other than I think it should be first implemented via AudioWorker and an FFT library, in Extensible Web Manifesto fashion.

You can, of course, use OfflineAudioContexts to freeze tracks. Of course, you will see more CPU load from an FFT implemented in JavaScript than from one implemented in C/C++. At the same time, I think you'll find that multiplier is not as big as you think it is; modern JS engines are pretty good at compilation and type optimization, and it's completely tenable to prototype an FFT library in JS before baking that API into Web Audio.
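In outline, freezing a track that way looks like the sketch below (trackBuffer and the effect chain are placeholders): render the processed track to a buffer once, then play the rendered buffer back cheaply.

// 60 seconds of stereo at 44.1 kHz, rendered faster than real time.
var offline = new OfflineAudioContext(2, 44100 * 60, 44100);
var src = offline.createBufferSource();
src.buffer = trackBuffer; // assumed: the unprocessed track's AudioBuffer
// ...build the same effect chain on `offline` as on the live context...
src.connect(offline.destination);
src.start();
offline.startRendering().then(function (rendered) {
  // `rendered` is an AudioBuffer: play it back in place of the live chain.
});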

@echo66

echo66 May 29, 2015

@cwilso, regarding "track freezing" with OfflineAudioContext, it seems that the bug between ScriptProcessors and OfflineAudioContext still exists.

@cwilso

cwilso May 29, 2015

Contributor

Of course it does. It's not worth it to fix, as we're deprecating ScriptProcessors and AudioWorkers will not have this problem.

@rtoy

rtoy May 29, 2015

Contributor

Not relevant to this issue, but what is the ScriptProcessor and offline context bug?

@cristiano-belloni

cristiano-belloni May 29, 2015

@rtoy you can't render a ScriptProcessor in an offline context. I don't have the bug report handy atm, but I can find it later.

@joeberkovitz

joeberkovitz Jun 1, 2015

Contributor

Our plan is to support this by allowing developers to use AudioWorker for initial takes on this idea, and to learn from those implementations when considering a first-class spectral-domain feature in future versions of the spec.

@joeberkovitz joeberkovitz modified the milestones: Web Audio V1, Uncommitted Jun 1, 2015

@cwilso

cwilso Jun 1, 2015

Contributor

Note that the issue here is the complexity of the potential options, not a lack of desire to implement such a thing. Step 1, to me, is getting AudioWorker working to remove the huge performance cost of thread-hopping in ScriptProcessor (not to mention running FFT code in the main JS thread). Step 2 is prototyping frequency-domain processors using a JS FFT library, to determine what options are necessary and sufficient. Step 2a is prototyping moving that FFT into native code to see if the speed benefits are worthwhile. I would spitball the cost of a JS FFT (in asm.js style) vs native at MAYBE two-to-one at worst; the cost of using ScriptProcessor is far more than that in practice.

Step 3, then, is deciding if there is a clear API design that emerges, and a significant benefit to the FFT being in native code.

(Also, as an aside, I'd expect it would be more useful to have a Math.FFT-style FFT library in native code than a Web-Audio-specific one.)

@rtoy

rtoy Aug 2, 2016

Contributor

Marking as feature request.

@rickygraham

rickygraham May 6, 2017

Hi all. Any more on whether or not we'll see IFFT support anytime soon?

@rtoy

rtoy May 8, 2017

Contributor

If you mean the AudioFrequencyWorkerNode, then no; it's still marked as v.next, so it's something to consider for the next version.

If you're asking about access to an IFFT routine, that's a different question. If this is what you want, you should file another issue on that.

@rickygraham

rickygraham May 8, 2017

It is the latter. I would like access to an IFFT routine.

@mattdiamond

mattdiamond Jun 19, 2018

This would be a pretty cool feature! If you want an example of something you can do with FFT/IFFT, I put together a frequency phase scrambler a little while back. Fun way to generate drones.
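For the curious, the core of the phase-scrambling idea is tiny (a sketch of the general technique, not mattdiamond's actual code): keep each bin's magnitude, randomize its phase, and resynthesize; the randomized phases smear transients into a sustained, drone-like texture.

// Keep magnitudes, randomize phases. fft/ifft are assumed library calls.
function scramblePhases(real, imag) {
  for (var k = 0; k < real.length; k++) {
    var mag = Math.sqrt(real[k] * real[k] + imag[k] * imag[k]);
    var phase = Math.random() * 2 * Math.PI;
    real[k] = mag * Math.cos(phase);
    imag[k] = mag * Math.sin(phase);
  }
}
// Typical use: fft(real, imag); scramblePhases(real, imag); ifft(real, imag);
// For a real output signal, restore conjugate symmetry (imag[0] = 0 and
// X[N-k] = conj(X[k])) before the inverse transform.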
