Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AnalyserNode design issues #86

Closed
olivierthereaux opened this issue Sep 11, 2013 · 7 comments
Closed

AnalyserNode design issues #86

olivierthereaux opened this issue Sep 11, 2013 · 7 comments
Assignees
Labels
Needs Edits Decision has been made, the issue can be fixed. https://speced.github.io/spec-maintenance/about/
Milestone

Comments

@olivierthereaux
Copy link
Contributor

Originally reported on W3C Bugzilla ISSUE-17361 Tue, 05 Jun 2012 11:49:46 GMT
Reported by Philip Jägenstedt
Assigned to

Audio-ISSUE-74 (RealtimeAnalyserNode): RealtimeAnalyserNode design [Web Audio API]

http://www.w3.org/2011/audio/track/issues/74

Raised by: Philip Jägenstedt
On product: Web Audio API

The RealtimeAnalyserNode has so many problems that we will put them into a single issue.

The use case appears to be probing/polling the signal for visualization, where it does not matter if all of the signal is available. It also doesn't modify the output, so it need not delay the processing at all, like JavaScriptAudioNode would.

The problems identified with the current spec are:

  1. It is undefined how multi-channel input maps to time/frequency data, which are both single arrays.
  2. The layout/order of the frequency bins is undefined. Are the negative frequencies included?
  3. It is undefined what happens if the array to the getters has more elements than frequencyBinCount.
  4. smoothingTimeConstant is defined only as "A value from 0 -> 1 where 0 represents no time averaging with the last analysis frame." How does it affect time/frequency domain data? What is an analysis frame and which is the last?
  5. If frequencyBinCount == fftSize / 2, why is it exposed at all?
  6. minDecibels/maxDecibels are undefined. Do minDecibels/maxDecibels control the output of getByteFrequencyData, or do they describe it? Are the parameters only used for getByteFrequencyData()? If so, why are they not arguments to that method? How does the Uint8 range (0-255) map to the decibel range (minDecibels-maxDecibels)?
  7. What happens if fftSize is set to something that is not a power of two? Are there any limits? Are 1 and 2^32 both valid values?
  8. If the fftSize is initially set to 1, then changed to 2^32, what should the getters do? To not restrict this requires unbounded buffering to handle an arbitrarily large fftSize.
  9. It's not at all clear why there are Uint8Array getters, instead of simply a frequency domain and time domain getter, both as Float32Array.

For the use case we're aware of, this can be simplified greatly. We'd prefer an interface that just allows probing the most recent time domain data as an AudioBuffer and leave it up to the Web developer to perform the FFT by other means. A fast, generic FFT function can be very useful not only for visualization, but also for synthesis, filters etc. In the absence of a native FFT implementation (which could be part of another specification - perhaps add it to the Math object), a custom JavaScript FFT implementation will most likely suffice for most applications.

For example:

interface AudioProbe : AudioNode {
    // get the most recent data available.
    AudioBuffer getData();
}

// in AudioContext, the size must be given up-front and cannot change
AudioProbe createAudioProbe(in unsigned long bufferSize);

Depending on how https://www.w3.org/2011/audio/track/issues/28 is resolved, we could simply have an attribute "AudioBuffer data" that is guaranteed to be stable while the script is executing, to avoid the use of a getter function altogether.

@joeberkovitz
Copy link
Contributor

It seems that the clarifications suggested by Philip represent a set of non-breaking-change edits to the spec that would be useful.

@cwilso cwilso changed the title (RealtimeAnalyserNode): RealtimeAnalyserNode design AnalyserNode design issues Oct 23, 2014
@cwilso
Copy link
Contributor

cwilso commented Oct 23, 2014

Related: #377

@cwilso cwilso added Clarification (Requires change) Needs Edits Decision has been made, the issue can be fixed. https://speced.github.io/spec-maintenance/about/ and removed Architectural/Fundamental (Breaking change) labels Oct 30, 2014
@cwilso cwilso added this to the Web Audio Last Call 1 milestone Oct 30, 2014
@joeberkovitz
Copy link
Contributor

Fix plan for this issue:

"It is undefined how multi-channel input maps to time/frequency data, which are both single arrays.": Added explicit down-mixing operation to FFT computation.

"The layout/order of the frequency bins is undefined. Are the negative frequencies included?":
"It is undefined what happens if the array to the getters has more elements than frequencyBinCount."
"smoothingTimeConstant is defined only as "A value from 0 -> 1 where 0 represents no time averaging with the last analysis frame." How does it affect time/frequency domain data? What is an analysis frame and which is the last?"
Previously addressed.

"If frequencyBinCount == fftSize / 2, why is it exposed at all?"
I don't see a problem retaining frequencyBinCount although it seems unnecessary

"minDecibels/maxDecibels are undefined. Do minDecibels/maxDecibels control the output of getByteFrequencyData, or do they describe it? Are the parameters only used for getByteFrequencyData()? If so, why are they not arguments to that method? How does the Uint8 range (0-255) map to the decibel range (minDecibels-maxDecibels)?"

"What happens if fftSize is set to something that is not a power of two? Are there any limits? Are 1 and 2^32 both valid values?"

"If the fftSize is initially set to 1, then changed to 2^32, what should the getters do? To not restrict this requires unbounded buffering to handle an arbitrarily large fftSize."

Previously addressed.

"It's not at all clear why there are Uint8Array getters, instead of simply a frequency domain and time domain getter, both as Float32Array."
I don't see a strong case for removing the Uint8 version at this point although I also fail to understand why it exists.

@joeberkovitz joeberkovitz self-assigned this Oct 14, 2015
@joeberkovitz
Copy link
Contributor

If anyone feels strongly that getByteFrequencyData() or frequencyBinCount should be removed from the AnalyserNode spec (see original comments by Philip) please say so. Otherwise I plan to retain.

@rtoy
Copy link
Member

rtoy commented Oct 14, 2015

On Wed, Oct 14, 2015 at 9:32 AM, Joe Berkovitz notifications@github.com
wrote:

I don't see a strong case for removing the Uint8 version at this point
although I also fail to understand why it exists.

​I'm guessing Chris Rogers added this to make visualizations using the time
and frequency data easier since the user doesn't have to do scaling of the
float values himself.​


Reply to this email directly or view it on GitHub
#86 (comment)
.

Ray

@joeberkovitz
Copy link
Contributor

@rtoy Thanks. Please let me know on #629 if this is good to merge.

rtoy added a commit that referenced this issue Dec 3, 2015
Fix #86 by clarifying channel down-mixing.
@rtoy
Copy link
Member

rtoy commented Dec 7, 2015

Just wanted to note that this has memory and/or cpu implications because you either have to keep all the channels in memory just in case someone calls getFoo (at which point you downmix), or you always have to downmix just in case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Edits Decision has been made, the issue can be fixed. https://speced.github.io/spec-maintenance/about/
Projects
None yet
Development

No branches or pull requests

5 participants