
(setValueCurveAtTime): AudioParam.setValueCurveAtTime #131

Closed
olivierthereaux opened this issue Sep 11, 2013 · 19 comments
Comments

@olivierthereaux
Contributor

Originally reported on W3C Bugzilla ISSUE-17335 Tue, 05 Jun 2012 11:26:17 GMT
Reported by Philip Jägenstedt

Audio-ISSUE-39 (setValueCurveAtTime): AudioParam.setValueCurveAtTime [Web Audio API]

http://www.w3.org/2011/audio/track/issues/39

Raised by: Philip Jägenstedt
On product: Web Audio API

The interpolation of values is undefined; the spec only says "will be scaled to fit into the desired duration." The duration parameter is also completely wrong, apparently copy-pasted from setTargetValueAtTime: "time-constant value of first-order filter (exponential) approach to the target value."

@olivierthereaux
Contributor Author

Original comment by Chris Rogers on W3C Bugzilla. Fri, 08 Jun 2012 20:01:20 GMT

Much more detailed text added in:
https://dvcs.w3.org/hg/audio/rev/14ffd37fc7ca

@olivierthereaux
Contributor Author

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 12 Jun 2012 07:00:29 GMT

I think the level of detail in the new text is good. One thing that seems to be missing though is what the value is when t < time and t >= time + duration, respectively (most likely values[0] and values[N-1], respectively).

Also, the expression "v(t) = values[N * (t - time) / duration]" is effectively nearest interpolation. Is that intended? Linear interpolation seems more logical.
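
For concreteness, a minimal sketch of the two readings being contrasted, in plain JS (the function names and the clamping at the array ends are illustrative, not from the spec text):

```js
// Minimal sketch, not spec text: two candidate readings of a curve
// scheduled with setValueCurveAtTime(values, time, duration), N = values.length.
function sampleNearest(values, time, duration, t) {
  const N = values.length;
  // "v(t) = values[N * (t - time) / duration]" with truncation and clamping
  const k = Math.min(N - 1, Math.floor(N * (t - time) / duration));
  return values[k];
}

function sampleLinear(values, time, duration, t) {
  const N = values.length;
  const x = (N - 1) * (t - time) / duration; // fractional index
  const k = Math.min(N - 2, Math.floor(x));
  return values[k] + (values[k + 1] - values[k]) * (x - k);
}
```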

@olivierthereaux
Contributor Author

Original comment by Chris Rogers on W3C Bugzilla. Tue, 04 Dec 2012 23:28:21 GMT

(In reply to comment #2)

> I think the level of detail in the new text is good. One thing that seems to
> be missing though is what the value is when t < time and t >= time +
> duration, respectively (most likely values[0] and values[N-1], respectively).
>
> Also, the expression "v(t) = values[N * (t - time) / duration]" is
> effectively nearest interpolation. Is that intended? Linear interpolation
> seems more logical.

The idea is that the number of points in the Float32Array can be large so that the curve data is effectively over-sampled and linear interpolation is not necessary.

Fixed:
https://dvcs.w3.org/hg/audio/rev/a658660f3174
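
A sketch of the oversampling idea described here: sample the desired function densely enough that plain index lookup stays inaudibly close to the true curve (the array length, and the `ctx`/`param` names, are illustrative assumptions):

```js
// Assumes `ctx` is an AudioContext and `param` an AudioParam
// (e.g. gainNode.gain). 8192 points is an arbitrary illustrative choice.
const N = 8192;
const curve = new Float32Array(N);
for (let i = 0; i < N; i++) {
  curve[i] = Math.exp(-5 * i / (N - 1)); // e.g. an exponential decay
}
param.setValueCurveAtTime(curve, ctx.currentTime, 0.5);
```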

@olivierthereaux
Contributor Author

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Wed, 05 Dec 2012 10:20:25 GMT

(In reply to comment #3)

> (In reply to comment #2)
>
> > I think the level of detail in the new text is good. One thing that seems to
> > be missing though is what the value is when t < time and t >= time +
> > duration, respectively (most likely values[0] and values[N-1], respectively).
> >
> > Also, the expression "v(t) = values[N * (t - time) / duration]" is
> > effectively nearest interpolation. Is that intended? Linear interpolation
> > seems more logical.
>
> The idea is that the number of points in the Float32Array can be large so
> that the curve data is effectively over-sampled and linear interpolation is
> not necessary.
>
> Fixed:
> https://dvcs.w3.org/hg/audio/rev/a658660f3174

Looks good for t >= time + duration. As for t < time, I guess the curve is not active, so it need not be defined (?).

Why don't we want linear interpolation? Linear interpolation would make the interface much easier to use, and could save a lot of memory. E.g. a plain ramp would occupy 256 KB for 16-bit precision without linear interpolation (and require a fair amount of JavaScript processing to generate the ramp). With linear interpolation the same ramp could be accomplished by a 2-entry Float32Array and minimal JavaScript processing.

I don't think that linear interpolation would cost much more in terms of performance, especially not compared to e.g. the exponential ramp.
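
An illustrative sketch of the comparison being made, assuming a linear-interpolating implementation and hypothetical `ctx`/`gainNode` names:

```js
// With linear interpolation, a plain ramp is just two points:
gainNode.gain.setValueCurveAtTime(new Float32Array([0, 1]), ctx.currentTime, 2.0);

// Without it, the same ramp at "16-bit" step resolution needs
// 2^16 floats * 4 bytes = 256 KB, generated in JavaScript:
const n = 1 << 16;
const dense = new Float32Array(n);
for (let i = 0; i < n; i++) dense[i] = i / (n - 1);
gainNode.gain.setValueCurveAtTime(dense, ctx.currentTime + 2.0, 2.0);
```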

@olivierthereaux
Contributor Author

Original comment by redman on W3C Bugzilla. Wed, 05 Dec 2012 15:48:51 GMT

(In reply to comment #3)

> (In reply to comment #2)
>
> > I think the level of detail in the new text is good. One thing that seems to
> > be missing though is what the value is when t < time and t >= time +
> > duration, respectively (most likely values[0] and values[N-1], respectively).
> >
> > Also, the expression "v(t) = values[N * (t - time) / duration]" is
> > effectively nearest interpolation. Is that intended? Linear interpolation
> > seems more logical.
>
> The idea is that the number of points in the Float32Array can be large so
> that the curve data is effectively over-sampled and linear interpolation is
> not necessary.
>
> Fixed:
> https://dvcs.w3.org/hg/audio/rev/a658660f3174

That idea assumes the user creates a 'curve' that is itself sufficiently oversampled (has way too much data than needed).
If you want to do it right you should provide an interpolator for undersampled cases and a bandlimiting filter for oversampled cases.
In other words, you should do proper resampling with audio-rate parameters.
Only if the data is played back at its original samplerate can you assume the data is a valid sample.
Since it is an audio rate controller you should always see it as a signal and apply signal theory.

@olivierthereaux
Contributor Author

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Thu, 06 Dec 2012 08:38:35 GMT

(In reply to comment #5)

> Since it is an audio rate controller you should always see it as a signal
> and apply signal theory.

True, but in this case I think that the real use case is to use quite low-frequency signals (like various forms of ramps that run for at least 20 ms or so). For those scenarios, band-limiting should not be necessary. As long as the spec mandates a certain method of interpolation (e.g. nearest, linear or cubic spline), the user knows what to expect and will not try to make other things with it (like modulating a signal with a high-frequency waveform).

Also, I think it's important that all implementations behave equally here, because different interpolation & filtering methods can lead to quite different results. E.g. a 5 second fade-out would sound quite different if it used nearest interpolation instead of cubic spline interpolation. In that respect, a simpler and more performance friendly solution (like nearest or linear interpolation) is better, because it's easier to mandate for all implementations.

@olivierthereaux
Contributor Author

Original comment by redman on W3C Bugzilla. Thu, 06 Dec 2012 15:15:35 GMT

(In reply to comment #6)

> (In reply to comment #5)
>
> > Since it is an audio rate controller you should always see it as a signal
> > and apply signal theory.
>
> True, but in this case I think that the real use case is to use quite
> low-frequency signals (like various forms of ramps that run for at least 20
> ms or so). For those scenarios, band-limiting should not be necessary. As
> long as the spec mandates a certain method of interpolation (e.g. nearest,
> linear or cubic spline), the user knows what to expect and will not try to
> make other things with it (like modulating a signal with a high-frequency
> waveform).
>
> Also, I think it's important that all implementations behave equally here,
> because different interpolation & filtering methods can lead to quite
> different results. E.g. a 5 second fade-out would sound quite different if
> it used nearest interpolation instead of cubic spline interpolation. In that
> respect, a simpler and more performance friendly solution (like nearest or
> linear interpolation) is better, because it's easier to mandate for all
> implementations.

I can tell you from years of synthesis experience that the resolution/quality of envelopes is crucial. This is especially true for creating percussive sounds.
Let's say I have a row of samples of an exponentially rising value that I want to use as the attack of my sound (it's modulating the .value of a gain node).
Now what happens if you play that row back at a higher rate than the original is that samples get skipped.
Most importantly, there is a good chance the last value of my exponential curve (the one hitting the maximum) will be dropped. So suddenly, my percussive sound misses energy! (That's besides the fact that there is also aliasing involved.) Moreover, it will sound different depending on how the curve data is mapped to the output sample rate for that particular note. So any dynamics applied to, say, the time the curve runs will result in improper spectral changes to my percussive sound.

So undersampling will work well only if people use pre-filtered sample data as curves, and even then there is a chance not all energy will come through, as the user must make sure the curve data is never played back too fast.
This is very limiting, as such curves are used in the time range of 1 ms to minutes. In other words, the curve processor needs to handle an enormous range of playback rates, and the results should be predictable.

With naive undersampling the results will become increasingly unpredictable the more of the curve's features (its rough parts) fall outside the audio band frequency-wise.
Remember that these curves will be used heavily as envelopes, and envelopes have an important role as impulse generators. If you undersample them you literally remove energy periodically from the impulse they represent. You need a proper downsampling algorithm that will preserve in-band energy and hopefully keep the phases together so as not to smear out the impulse too much.
Otherwise we could just as well go back 15 years to a time when programmers started to try to make musical instruments. ;)
But what worries me more is that it is not clear to the user that the data he uses for the curve might be improper because of the curve playback rate. For instance, how long do I need to make my curve data to not get any problems when I use it for a range between 3 and 3000 ms?
I'm not sure people want to think about these things. If you offer such a feature then I'd expect the implementation to deal with it correctly.

About undersampling: after some thought I'd say that both nearest-neighbor and linear interpolation could be handy.
The nearest-neighbor method should be done correctly though (no idea what the implementations do, but chances are they do it wrong :) ).

Usually such an algorithm has a balance point at .5: a comparison is made to see if the value at a time is closer to the previous or the next sample, and the output switches halfway between the samples.
This will give problems with short stretched curves. The first sample of the curve will be played back for half a time period (because at 0.5 sample time the switch to the next value is made), then all middle samples will be played at a full period (but shifted by 0.5 sample time), and then the last one at half a period again.
A better way would be to make it more like truncating the decimals. This way you ensure every sample value in the curve data gets played for the correct duration, which makes much more sense musically.
So for these kinds of things truncation is better than the usual way of rounding around the .5 mark.

But then sometimes you don't want to hear these steps at all.
For those cases it would be great if you could switch on a linear interpolator (shouldn't be a bigger hit on performance than the truncation above, except when the CPU doesn't handle floats well).
The main idea is that it should be switchable.

Fancier interpolation is probably not very useful in this case.
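
An illustrative sketch of the two nearest-neighbor index rules being contrasted (helper names are hypothetical, not from any spec text):

```js
// Rounding switches halfway between samples, so on a stretched curve the
// first and last values each hold for only half a step:
function indexRounded(N, time, duration, t) {
  return Math.min(N - 1, Math.round(N * (t - time) / duration));
}

// Truncation holds every sample for one full step:
function indexTruncated(N, time, duration, t) {
  return Math.min(N - 1, Math.floor(N * (t - time) / duration));
}
```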

@olivierthereaux
Contributor Author

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Fri, 07 Dec 2012 13:15:14 GMT

redman, are you suggesting that other curves (setValueAtTime, linearRampToValueAtTime, exponentialRampToValueAtTime and setTargetAtTime) should be subject to filtering too? Not sure if you can construct a case where you get out-of-band frequencies using those, but I guess you can (e.g. an exponential ramp with a very short duration).

@olivierthereaux
Contributor Author

Original comment by redman on W3C Bugzilla. Fri, 07 Dec 2012 14:29:30 GMT

(In reply to comment #8)

> redman, are you suggesting that other curves (setValueAtTime,
> linearRampToValueAtTime, exponentialRampToValueAtTime and setTargetAtTime)
> should be subject to filtering too? Not sure if you can construct a case
> where you get out-of-band frequencies using those, but I guess you can (e.g.
> an exponential ramp with a very short duration).

Certainly not! :)
No, just referring to curves.

As for the other parameters, it would be handy to have a 'better than linear/ramp' interpolator that can be switched on or off.
But (AA) filtering would only be necessary when oversampling, as can be the case with curves.

@olivierthereaux
Contributor Author

Original comment by Marcus Geelnard (Opera) on W3C Bugzilla. Tue, 11 Dec 2012 09:08:04 GMT

(In reply to comment #9)

> Certainly not! :)
> No, just referring to curves.
>
> As for the other parameters, it would be handy to have a 'better than
> linear/ramp' interpolator that can be switched on or off.
> But (AA) filtering would only be necessary when oversampling, as can be the
> case with curves.

Well, I'm pretty sure that a mathematical exponential ramp exhibits an infinite frequency spectrum (i.e. requires an infinite number of Fourier terms to reconstruct properly), and that just sampling it without any filtering will indeed result in aliasing. This is also true for a linear ramp, or even a simple setValueAtTime.

That's analogous to what would happen if you implemented the Oscillator node with just trivial mathematical functions (such as using the modulo operator to implement a saw wave).

I guess that my point is: Do we really have to care about aliasing filters for AudioParams at all? It would make things much more complicated. If you really want to do things like Nyquist-correct sampling of a custom curve, you can use an AudioBufferSourceNode as the input to an AudioParam instead.
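
For reference, a sketch of that workaround with the existing API (assuming `ctx` is an AudioContext); a node's output connected to an AudioParam is summed with the param's value:

```js
// Build a 1-second ramp as ordinary audio data.
const buf = ctx.createBuffer(1, ctx.sampleRate, ctx.sampleRate);
const data = buf.getChannelData(0);
for (let i = 0; i < data.length; i++) data[i] = i / (data.length - 1); // 0 -> 1

const src = ctx.createBufferSource();
src.buffer = buf;
const gain = ctx.createGain();
gain.gain.value = 0;      // base value; the source supplies the ramp
src.connect(gain.gain);   // audio-rate control of the param
src.start(ctx.currentTime);
```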

@olivierthereaux
Contributor Author

Original comment by redman on W3C Bugzilla. Tue, 11 Dec 2012 16:32:55 GMT

(In reply to comment #10)

> Well, I'm pretty sure that a mathematical exponential ramp exhibits an
> infinite frequency spectrum (i.e. requires an infinite number of Fourier
> terms to reconstruct properly), and that just sampling it without any
> filtering will indeed result in aliasing. This is also true for a linear
> ramp, or even a simple setValueAtTime.
>
> That's analogous to what would happen if you implemented the Oscillator node
> with just trivial mathematical functions (such as using the modulo operator
> to implement a saw wave).
>
> I guess that my point is: Do we really have to care about aliasing filters
> for AudioParams at all? It would make things much more complicated. If you
> really want to do things like Nyquist-correct sampling of a custom curve,
> you can use an AudioBufferSourceNode as the input to an AudioParam instead.

Well, the problem is kind of that there will be very different requirements depending on what the AudioParam is controlling.
Sometimes nearest-neighbour is what you want, sometimes it's linear interpolation, and sometimes you just need a properly anti-aliased signal or just a smooth signal.
If I control the pitch of an oscillator with note data then I usually don't care about aliasing. In this case I want no filtering at all.
But if I control the volume of a gain node with a function, and it allows me to use a variable-rate sample, then I would expect that sample to be properly anti-aliased at any rate.

I'm not sure a filter would be that much more complicated. There is already a filter active on the setValue method of AudioParams.
The curves capability allows people to use vastly unsuitable material (complex sound played back above its sample rate) as input to the AudioParam.
Using an AudioBufferSourceNode may be problematic with scheduling, and there is already a problem with setValue as it always applies a heavy filter to changing values in the current implementations. I believe that is the reason the curve was invented in the first place.
So why not just make it right instead of expecting the user to know/find out in what ways it is broken?

@olivierthereaux
Contributor Author

Original comment by Chris Rogers on W3C Bugzilla. Tue, 11 Dec 2012 18:26:55 GMT

Here's my take:

AudioParam already has several ways of being controlled:

  1. linear and exponential ramps: these are very well established ways of controlling parameters going back decades in computer music systems and synth design. Furthermore, these can be a-rate, so much smoother than systems which work exclusively using k-rate.

  2. arbitrary curves: These are mostly useful as slowly varying curves, but can be oversampled to a very high degree (2x, 4x, 8x) by providing a Float32Array which is far longer than the duration of the curve. I don't think we should be worried about memory performance here since these will still generally be much smaller than the audio assets themselves. This oversampling can help to a great degree to band-limit the signal.

  3. control via audio-rate signals: These signals can be band-limited to the extent that the source node and the processing nodes use band-limited approaches.

Especially with (3) we have a pretty rich possibility of controlling the parameters, including ways which are concerned about band-limited signals.
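
For reference, a sketch of the three control styles listed above (assuming illustrative `ctx`/`osc` names for an AudioContext and an OscillatorNode):

```js
// 1. ramps:
osc.frequency.setValueAtTime(440, ctx.currentTime);
osc.frequency.exponentialRampToValueAtTime(880, ctx.currentTime + 1);

// 2. an arbitrary (deliberately oversampled) curve:
const curve = new Float32Array(4096);
for (let i = 0; i < curve.length; i++)
  curve[i] = 880 - 440 * i / (curve.length - 1); // glide back down
osc.frequency.setValueCurveAtTime(curve, ctx.currentTime + 1, 1);

// 3. audio-rate control from another (band-limited) node:
const lfo = ctx.createOscillator();
const depth = ctx.createGain();
lfo.frequency.value = 5;
depth.gain.value = 20; // +/- 20 Hz vibrato
lfo.connect(depth).connect(osc.frequency);
lfo.start();
```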

This does bring up other areas of the API which need to be concerned with aliasing:

AudioBufferSourceNode: currently its interpolation method is unspecified. WebKit uses linear interpolation, but cubic and higher-order methods could be specified using an attribute.

OscillatorNode: once again the quality could be controlled via an attribute. WebKit currently implements a fairly high-quality interpolation here.

WaveShaperNode: there are two aspects of interest here:

  1. How is the .curve attribute sampled? Currently the spec defines this as "drop sample" interpolation and not even linear, but we should consider the option of linear. I'm concerned about this one because I notice people are using the WaveShaperNode for distortion with relatively small curves (such as 8192 entries in tuna.js), which will end up not only shaping the signal but also adding a bit-crushing/bit-decimation effect (which may or may not be the effect wanted).

  2. Is the wave-shaping curve applied at the AudioContext sample-rate, or is the signal first up-sampled to a higher sample-rate to avoid aliasing? The option to have band-limited wave-shaping will become more and more important with the advent of applications like guitar amp simulations. Aliasing can seriously affect the quality of the distortion sound. We know people are interested in these kinds of applications, since they're already showing up:
    (Stuart Memo's work, tuna.js, and http://dashersw.github.com/pedalboard.js/demo/)

@olivierthereaux
Contributor Author

Original comment by redman on W3C Bugzilla. Tue, 11 Dec 2012 19:34:44 GMT

(In reply to comment #12)

> Here's my take:
>
> AudioParam already has several ways of being controlled:
>
> 1. linear and exponential ramps: these are very well established ways of
>    controlling parameters going back decades in computer music systems and
>    synth design. Furthermore, these can be a-rate, so much smoother than
>    systems which work exclusively using k-rate.
>
> 2. arbitrary curves: These are mostly useful as slowly varying curves, but
>    can be oversampled to a very high degree (2x, 4x, 8x) by providing a
>    Float32Array which is far longer than the duration of the curve. I don't
>    think we should be worried about memory performance here since these will
>    still generally be much smaller than the audio assets themselves. This
>    oversampling can help to a great degree to band-limit the signal.

I'd agree with this, except that it may not be clear to the user that the data should be sufficiently smooth for it to be rendered at higher speeds without artefacts.
While it's easy to understand that you can use longer arrays to get more precision, my guess is that most people won't understand enough about audio to know why their custom curve sounds bad at higher speeds.

> 3. control via audio-rate signals: These signals can be band-limited to the
>    extent that the source node and the processing nodes use band-limited
>    approaches.

You forgot case 4): directly setting the value without any interpolation.

> Especially with (3) we have a pretty rich possibility of controlling the
> parameters, including ways which are concerned about band-limited signals.

But it would be computationally intensive to create a flexible envelope generator in JS that generates samples at audio rate.

Usually an envelope consists of several segments of functions that are controlled independently.
What you want is to be able to glue these segments together at different rates.
So a decay segment could be the same series of samples as the release, but at a different rate.
If you were to use a sample generator you would have to calculate the length of these segments at the desired speed before you could schedule them.
You would also have to somehow cascade several of these generators and switch between them. Getting this stuff right will not be fun, and it would be clearer if you could just use the curve function to map a curve sample to a specified time.
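
A sketch of the segment-gluing pattern described here, reusing one sample array at two rates (assuming illustrative `ctx`/`gain` names; a real envelope generator would also scale the values between segments):

```js
// One decaying segment, reused as both decay and release.
const seg = new Float32Array(512);
for (let i = 0; i < seg.length; i++)
  seg[i] = Math.exp(-4 * i / (seg.length - 1)); // 1 -> ~0.02

const t = ctx.currentTime;
gain.gain.setValueCurveAtTime(seg, t, 0.08);       // fast decay segment
gain.gain.setValueCurveAtTime(seg, t + 0.08, 1.5); // same data as a slow release
```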

> This does bring up other areas of the API which need to be concerned with
> aliasing:
>
> AudioBufferSourceNode: currently its interpolation method is unspecified.
> WebKit uses linear interpolation, but cubic and higher-order methods could
> be specified using an attribute.

For samples I'd suggest a FIR filter with a sinc kernel if you implement anything fancier than linear.
Such a filter would be usable in both the oversampled and undersampled cases.
Cubic would only sparingly be appropriate, and then mostly if the original data represents an already smooth function. It will certainly get you into trouble with periodic signals of mid to high frequency.

> OscillatorNode: once again the quality could be controlled via an attribute.
> WebKit currently implements a fairly high-quality interpolation here.

Do you mean the .frequency parameter?
If so, then the interpolation here is almost useless without a way to control the rate of change. Now you need to circumvent the interpolation to get a straight note out of the oscillator.

> WaveShaperNode: there are two aspects of interest here:
>
> 1. How is the .curve attribute sampled? Currently the spec defines this as
>    "drop sample" interpolation and not even linear, but we should consider the
>    option of linear. I'm concerned about this one because I notice people are
>    using the WaveShaperNode for distortion with relatively small curves (such
>    as 8192 entries in tuna.js), which will end up not only shaping the signal
>    but also adding a bit-crushing/bit-decimation effect (which may or may not
>    be the effect wanted).

I agree that short curves will lead to extra degradation.
But both the interpolated and non-interpolated cases are interesting from a musical point of view. In other words, it would be really cool if there was interpolation, but it needs to be optional.
The objective for this interpolator would be to create a smooth curve (without the peaks or resonances that, for instance, cubic or sinc would introduce). The non-linearity of such an interpolator can be welcome in the case of heavy distortion. In fact, the whole point of distortion is to introduce a non-linear change to the waveform. A classic clipping distortion is full of alias-like distortion, for instance.
So here the requirements for an interpolator are different than in the case of resampling.
All of this is great for raw sounds.

> 2. Is the wave-shaping curve applied at the AudioContext sample-rate, or is
>    the signal first up-sampled to a higher sample-rate to avoid aliasing? The
>    option to have band-limited wave-shaping will become more and more important
>    with the advent of applications like guitar amp simulations. Aliasing can
>    seriously affect the quality of the distortion sound. We know people are
>    interested in these kinds of applications, since they're already showing up:
>    (Stuart Memo's work, tuna.js, and
>    http://dashersw.github.com/pedalboard.js/demo/)

It would be super if the algorithm did oversample.
This would allow especially subtle use of the wave-shaper.
A possible problem is that you may have to oversample a couple of times to get a properly anti-aliased distortion.

@olivierthereaux
Contributor Author

Original comment by redman on W3C Bugzilla. Tue, 11 Dec 2012 19:38:41 GMT

Is there anything against the idea of having separate interpolator objects?
I'm starting to realize that a lot of the time you want to choose whether and how you want to interpolate. So why not have interpolation as a separate module that you can plug in front of an AudioParam?
Or is this a crazy idea? :)

@olivierthereaux
Contributor Author

Original comment by Chris Rogers on W3C Bugzilla. Tue, 11 Dec 2012 21:21:20 GMT

(In reply to comment #14)

> Is there anything against the idea of having separate interpolator objects?
> I'm starting to realize that a lot of the time you want to choose whether
> and how you want to interpolate. So why not have interpolation as a
> separate module that you can plug in front of an AudioParam?
> Or is this a crazy idea? :)

I don't think that there's any simple way to generically abstract an interpolator object to work at the level of the modules in the Web Audio API. Marcus has suggested an approach which is very much lower-level with his math library, but that's assuming a processing model which is very much different than the "fire and forget" model we have here.

Even if there were a way to simply create an interpolator object and somehow attach it to nodes (which I don't think there is), I think that for the 99.99% case developers don't want to have to worry about such low-level details for such things as "play sound now". I've tried to design the AudioNodes such that they all have reasonable default behavior, trading off quality versus performance. An attribute for interpolation quality seems like a simple way to extend the default behavior, without requiring developers to deal with interpolator objects all the time.

@joeberkovitz
Contributor

It's unclear what the state of play is for the original problem of under-/mis-specification of the interpolation. It looks as though the language has been somewhat cleared up, but the definition of "scaled to fit the desired duration" still seems fuzzy.

@mdjp
Member

mdjp commented Oct 28, 2014

TPAC RESOLUTION: Spec to clarify linear interpolation. If other interpolation is required, a feature request is needed.

@mdjp mdjp added the Needs Edits Decision has been made, the issue can be fixed. https://speced.github.io/spec-maintenance/about/ label Oct 28, 2014
@cwilso cwilso modified the milestone: Web Audio Last Call 1 Oct 29, 2014
@rtoy
Member

rtoy commented Jun 19, 2015

Oops. This issue was the same as #547. Except in #547, we decided to use nearest instead of linear.

What should we do? I am fine with either nearest or linear interpolation.

@padenot
Member

padenot commented Jun 21, 2015

Doing an interpolation seems more useful. You can build an array with step functions if you want steps.

padenot added a commit that referenced this issue Aug 27, 2015
Fix #131: specify linear interpolation for setValueCurveAtTime
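
For reference, a sketch of the resolved behaviour (paraphrased, with the curve read as N - 1 linear intervals; the spec text from the commit is normative):

```js
// Linear interpolation over the curve, holding the first/last value
// outside [startTime, startTime + duration].
function valueCurveAt(values, startTime, duration, t) {
  const N = values.length;
  if (t <= startTime) return values[0];
  if (t >= startTime + duration) return values[N - 1];
  const x = (N - 1) * (t - startTime) / duration; // fractional index
  const k = Math.floor(x);
  return values[k] + (values[k + 1] - values[k]) * (x - k);
}
```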