(OscillatorFolding): Oscillator folding considerations #127

Closed
olivierthereaux opened this issue Sep 11, 2013 · 7 comments

Comments

@olivierthereaux
Contributor

Originally reported on W3C Bugzilla ISSUE-17404 Tue, 05 Jun 2012 12:29:45 GMT
Reported by Michael[tm] Smith

Audio-ISSUE-85 (OscillatorFolding): Oscillator folding considerations [Web Audio API]

http://www.w3.org/2011/audio/track/issues/85

Raised by: Philip Jägenstedt
On product: Web Audio API

It is not defined how the time-domain signal of an oscillator is generated. The main reason for WaveTable being defined in the frequency domain appears to be to allow for Nyquist-correct signal synthesis. For example, if the Oscillator frequency is 1000 Hz and the WaveTable has length 4096, the table can hold up to 2048 harmonics, so the highest frequency component could be as high as 2048 kHz, far above the Nyquist frequency, which would cause folding artifacts.
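A hypothetical numeric sketch (not part of the bug report) of where such over-Nyquist partials end up, assuming a 44100 Hz sample rate:

```python
def folded_frequency(f, fs):
    """Frequency at which a partial of frequency f is heard after
    sampling at rate fs: anything above fs/2 reflects back down."""
    f = f % fs                      # sampling is periodic in fs
    return f if f <= fs / 2 else fs - f

# A 1000 Hz oscillator at fs = 44100 Hz: harmonic 30 sits at 30 kHz,
# above the 22050 Hz Nyquist limit, and folds down to 14100 Hz.
fs = 44100.0
for k in (10, 22, 30, 44):
    print(k, k * 1000.0, folded_frequency(k * 1000.0, fs))
```

The folded partials land at inharmonic frequencies, which is why unchecked folding is so audible.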

Depending on how the time-domain signal is generated, the anti-aliasing performed would sound very different.

For example, if the signal is generated in the naive way, by looping over the output for each frequency component, one could simply stop before the Nyquist frequency. However, this approach could be very slow.
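A minimal sketch of that naive method, in Python for illustration; `render_oscillator` and its arguments are invented names, not part of any spec:

```python
import math

def render_oscillator(harmonics, f0, fs, n):
    """Naive per-partial additive rendering: harmonics[k-1] is the
    amplitude of partial k. Partials at or above the Nyquist frequency
    are skipped, which prevents folding but costs one sine evaluation
    per partial per output sample."""
    out = []
    for i in range(n):
        t = i / fs
        sample = 0.0
        for k, a in enumerate(harmonics, start=1):
            if k * f0 < fs / 2:        # stop before Nyquist
                sample += a * math.sin(2 * math.pi * k * f0 * t)
        out.append(sample)
    return out
```

The inner loop runs once per partial per sample, which is the "very slow" cost referred to above.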

If nothing is done to prevent folding, the purpose of having a frequency-domain WaveTable at all is questionable.

Finally, should the built-in types (SINE, SQUARE, etc) also be generated using WaveTables internally and be subject to the same folding processing as custom WaveTables?
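For reference, the classic Fourier-series amplitudes that an internal WaveTable for SQUARE would contain (a textbook result, not anything the spec mandates):

```python
import math

def square_wave_partials(n):
    """Amplitudes of the first n sine partials of an ideal square wave:
    odd harmonic k has amplitude 4/(pi*k); even harmonics are absent."""
    return [4.0 / (math.pi * k) if k % 2 == 1 else 0.0
            for k in range(1, n + 1)]
```

Since this series never terminates, a built-in SQUARE has the same over-Nyquist problem as any custom WaveTable.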

@olivierthereaux
Contributor Author

Original comment by Olivier Thereaux on W3C Bugzilla. Thu, 07 Jun 2012 08:17:25 GMT

[admin] Assigning items currently being worked on by editor.

@olivierthereaux
Contributor Author

Original comment by Chris Rogers on W3C Bugzilla. Thu, 07 Jun 2012 19:12:51 GMT

More detailed background added:
https://dvcs.w3.org/hg/audio/rev/afb5ef123c50

Similar to how we do not specify the exact anti-aliasing algorithms for lines and circles in the Canvas 2D specification, or the exact smoothing used for image resizing, I don't think we should specify the exact rendering here. Instead, we need to define the precise "ideal" rendering which an actual implementation should strive to achieve.

@olivierthereaux
Contributor Author

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 12 Jun 2012 09:49:09 GMT

Overall, the new text is non-normative, except for the phrasing "care must be taken to discard (filter out) the high-frequency information". Here, it is said that something must be done, without specifying what must be done.

At this point, I don't really have a preference for whether we should strive to have a common method for synthesizing sound, or allow for variations between implementations. However, I think it should be clear what the upper/lower quality bound is.

For instance, if we disregard the anti-aliasing requirement, it would be possible for an implementation to simply do an inverse FFT of the wave table as a pre-processing step, and then do nearest neighbor interpolation into that time-domain signal without any anti-aliasing or interpolation efforts at all. Would that be acceptable?
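A sketch of exactly that low-effort approach, with naive direct summation standing in for the inverse FFT (all names here are illustrative):

```python
import math

def time_domain_table(harmonics, size):
    """One period of the waveform, as an inverse FFT of the WaveTable
    would produce it (computed here by direct summation for clarity)."""
    return [sum(a * math.sin(2 * math.pi * k * i / size)
                for k, a in enumerate(harmonics, start=1))
            for i in range(size)]

def play_nearest(table, f0, fs, n):
    """Nearest-neighbour table lookup driven by a phase accumulator:
    no interpolation and no filtering, so partials pushed above Nyquist
    by a high f0 fold straight back into the audible range."""
    size = len(table)
    out, phase = [], 0.0
    for _ in range(n):
        out.append(table[int(phase * size) % size])
        phase = (phase + f0 / fs) % 1.0
    return out
```

The table is built once, so playback is a single lookup per sample, which is why this is attractive as a cheap baseline despite the aliasing.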

@olivierthereaux
Contributor Author

Original comment by Chris Rogers on W3C Bugzilla. Sat, 06 Oct 2012 01:24:02 GMT

(In reply to comment #3)

Overall, the new text is non-normative, except for the phrasing "care must be taken to discard (filter out) the high-frequency information". Here, it is said that something must be done, without specifying what must be done.

At this point, I don't really have a preference for whether we should strive to have a common method for synthesizing sound, or allow for variations between implementations. However, I think it should be clear what the upper/lower quality bound is.

For instance, if we disregard the anti-aliasing requirement, it would be possible for an implementation to simply do an inverse FFT of the wave table as a pre-processing step, and then do nearest neighbor interpolation into that time-domain signal without any anti-aliasing or interpolation efforts at all. Would that be acceptable?

From a purist perspective, I don't consider that an acceptable technique for synthesis of high-quality oscillators because it will generate considerable aliasing. But, I consider it ok for a basic implementation, especially if it's used as a performance optimization for low-end hardware. Once again, I'd make the analogy with drawing lines. It's "allowed" for a browser to draw jagged lines, but they might not look so great compared with nicely anti-aliased smooth lines.

I'm happy to share implementation techniques for getting reasonably high-quality oscillators. In WebKit, the approach we're currently taking is a multi-table approach where we generate a dozen or so tables with successively filtered out partials, then index dynamically into the table appropriate for the instantaneous playback frequency. We have code to share, or we could discuss the general approach in more technical detail (without code). In any case, that would be an informative section if we added something to the spec there.
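The general shape of such a multi-table scheme might look like the following sketch; the table count, frequency ranges, and all names are illustrative guesses, not WebKit's actual code:

```python
import math

def make_table(harmonics, size):
    """One period of a waveform from its partial amplitudes."""
    return [sum(a * math.sin(2 * math.pi * k * i / size)
                for k, a in enumerate(harmonics, start=1))
            for i in range(size)]

def build_table_bank(harmonics, fs, size, lowest_f0=20.0, num_tables=12):
    """A dozen progressively band-limited tables: table j serves
    fundamentals up to lowest_f0 * 2**(j+1) and keeps only partials
    that remain below Nyquist at that top frequency."""
    bank = []
    for j in range(num_tables):
        top_f0 = lowest_f0 * 2 ** (j + 1)
        kept = [a if k * top_f0 < fs / 2 else 0.0
                for k, a in enumerate(harmonics, start=1)]
        bank.append((top_f0, make_table(kept, size)))
    return bank

def select_table(bank, f0):
    """At render time, index into the table for the instantaneous
    playback frequency: the first range that covers f0."""
    for top_f0, table in bank:
        if f0 <= top_f0:
            return table
    return bank[-1][1]      # above all ranges: most heavily filtered
```

The trade-off is memory for a handful of precomputed tables in exchange for alias-free playback at the cost of one table selection (or a crossfade between neighbouring tables) per rendered block.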

@olivierthereaux
Contributor Author

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 16 Oct 2012 12:39:10 GMT

But, I consider it ok for a basic implementation, especially if it's used as a performance optimization for low-end hardware.

My main concern here is the wording in the spec. As it is now, the only "must" in the text appears in a section that seems to be non-normative. If interpreted as a normative statement (which would currently be a correct interpretation of the spec), an implementation MUST filter out frequencies above the Nyquist frequency.

Suggestion: Make the first part of section 2.23 non-normative (the text before "Both .frequency and .detune are a-rate parameters..."), and drop the "must" from the sentence "care must be taken to discard (filter out) the high-frequency information".

@padenot
Member

padenot commented Sep 16, 2013

Are we okay with replacing the MUST in "care must be taken to discard (filter out) the high-frequency information" with a SHOULD? Filtering high frequencies out of a square wave used as an 8 Hz LFO is certainly a reasonable exception, in the sense of the SHOULD of RFC 2119.

I'd say making sure frequencies above Nyquist don't fold back into the audible range should be required for a conforming implementation, since not doing so makes a very noticeable difference.

@padenot
Member

padenot commented Sep 30, 2013

As per ACTION-73, we are keeping MUST here, but a way to request mathematically correct oscillators will be added in issue 244.
