Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify ABSN playback algorithm exactly. #1143

Merged
merged 20 commits into from Apr 11, 2017
Merged

Specify ABSN playback algorithm exactly. #1143

merged 20 commits into from Apr 11, 2017

Conversation

joeberkovitz
Copy link
Contributor

Fix #95

index.html Outdated
instance, playing back in rhythmically-perfect ways. If
sample-accurate playback of network- or disk-backed assets is
instance, playing back musical events that are rhythmically perfect.
If sample-accurate playback of network- or disk-backed assets is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this relevant to the issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought was since the spec already used the word "rhythmically" then we must already be talking about music, and the phrase seemed to be an example rather than spec language.

I can remove this language to talk about accurate timing instead.

index.html Outdated
the buffer. If <code>loopEnd</code> is less than or equal to 0,
or if <code>loopEnd</code> is greater than the duration of the
buffer, looping will end at the end of the buffer. The loop
endpoint within the buffer is determined by multiplying by the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiply what by the buffer's sample rate? Something is missing here. Plus, I think we don't really care anymore what frame it translates to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is some legacy language. I think we can just talk about it as an offset from the start of the buffer content in seconds. We don't need to talk about multiplication here since the algorithm makes that clear later.

index.html Outdated
<code>actualLoopStart</code> for interpolation purposes.
</li>
<li>Let the operation <em>optimize the buffer</em> be any operation
that alters both the buffer contents and sample rate in a way that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to imply that the ABSN buffer is actually modified. I think that's not what we want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix this to explicitly state that modification doesn't happen

index.html Outdated
<li>Let the operation <em>optimize the buffer</em> be any operation
that alters both the buffer contents and sample rate in a way that
increases the efficiency or quality of rendering, while minimizing
changes to the <code>playbackSignal()</code> function. The nature
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think all of this optimization and such can be left to playbackSignal function to optimize in whatever way is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps. My motivation is simply to capture the fact that buffer resampling can occur at certain times as an optimization. We could simply allow playbackSignal() to employ buffer resampling at will, along with the other UA-determined interpolation that it already does.

Would that work from your point of view?

index.html Outdated
might include upsampling, downsampling, applying a subsample
offset, or loop unrolling. This operation may affect the values of
<code>sampleRate</code>, <code>playbackRate</code> and
<code>offset</code>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think such optimizations should modify any of these values. They are what they are and optimization should not change the values, as viewed by this algorithm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right about sampleRate and offset. But if you resample a buffer, then its sample rate changes, doesn't it?

Maybe this is moot though if we just fuzz out the whole desription of optimization and let the UA do whatever it wants so long as it's effectively invisible to the outcome.

index.html Outdated
</li>
</ol>
</li>
<li>Increase <code>bufferTime</code> by <code>dt *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is at the wrong level and should be in the Else clause. Otherwise bufferTime gets incremented even if currentTime < start.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. I will fix this!

index.html Outdated
<li>Let <code>effectiveBufferTime</code> equal
<code>bufferTime</code>.
</li>
<li>While <code>loop</code> is true and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this reads a bit oddly. Maybe have bullet 3 just be the text "map the playhead...", then a new list consisting of

  1. while loop is true and effectiveBufferTime >= actualLoopEnd
    1. effectiveBufferTime -= actualLoopEnd - actualLoopStart

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

index.html Outdated
greater than or equal to <code>actualLoopEnd</code> and there is no
subsequent sample frame in <code>buffer</code>, then the next
sample frame should be taken as the playback signal at
<code>actualLoopStart</code> for interpolation purposes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this captures what you want. Plus I think it depends on the value of loop. I think the only way position can be past loopEnd is if we're not looping, and then we do "normal" interpolation.

How the interpolation is done when we're looping and near the end (or beginning?) of the loop can be up to the UA? Maybe playbackSignal needs an extra parameter to specify if we're looping or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you take a look at the diagrams that follow (especially the one with half-speed playback of looped audio)?

It's not a matter of position being past loopEnd; it's a matter of position lying between the last sample frame in the loop and loopEnd . This happens (among other scenarios) when the playback rate is less than unity. This is a lot easier to see graphically.

We can certainly introduce something algorithmic to detect this case -- I used to have such language, but I thought you felt it was too complicated. It is difficult to make this language hard-edged and still leave the interpolation algorithm up to the UA.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I understand your point and agree. The only issue I have now is that this kind of seems to imply 2-point (linear) interpolation because you only say the next point is from loopStart.

Maybe good enough if you just said the next sample frames are taken from actualLoopStart?

index.html Outdated
* pow(2, detune / 1200) * (bufferRate / contextRate)</code> .
This combines the two k-rate parameters affecting playback rate
into a single computed quantity, and also corrects for any
difference in sample rate between the buffer and the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you want to include bufferRate/contextRate here. If the UA actually did internally resample the buffer to the contextRate, then this computedPlaybackRate will produce the wrong values (I think) for the following algorithm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the preceding comment, if the UA internally resamples the buffer, then I think both playbackRate and bufferRate should be adjusted so that computedPlaybackRate remains constant. That was why I suggested that optimization could affect these quantities.

If we don't include this ratio here, then how can we account for the fact that the effective playback rate must be unequal to 1 (and require interpolation) when the buffer rate and the sample rate are not identical?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking at the language more carefully I realize that you're right, this ratio shouldn't be included here.

…and offset parameter descriptions, eliminating legacy language.
Remove unused and confusing bufferRate quantity from algorithm.
Rework confusing wording.
@joeberkovitz
Copy link
Contributor Author

@rtoy I've attempted to address your comments except for the "fake loop end frame" language, since I'm still not sure how to deal with that best.

Copy link
Member

@rtoy rtoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still just reviewing the algorithm; the examples haven't been reviewed yet.

@@ -4709,6 +4708,18 @@ <h2 id="AudioBufferSourceNode">
This node has no <a>tail-time</a> reference.
</p>
<p>
A <dfn>playhead position</dfn> for an <a>AudioBufferSourceNode</a> is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be in a weird place. It seems that it's really only relevant to the playback section and might be better to move the paragraph there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I moved it here because it's now referenced in the definitions for loopStart, loopEnd and offset (which allowed me to remove some clumsy language for those parameters that you had commented on). Make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still find this odd to have this here. As mentioned below, we don't say anything about multiplying by the sample rate and loopStart and loopEnd are just values in seconds indexing into the buffer. I'd prefer this paragraph in the the playback section.

WDYT @padenot

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll wait to see what Paul says, but I clarified the language somewhat.

@@ -4778,41 +4789,28 @@ <h2 id="AudioBufferSourceNode">
</dt>
<dd>
<p>
An optional value in seconds where looping should begin if the
<code>loop</code> attribute is true. Its default
An optional <a>playhead position</a> where looping should begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see anything wrong with the original text and don't think it contradicts in any way the new text. I don't find the playhead position adding any additional info. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It allowed me to remove several instances of language having to do with multiplying by the sample rate, which you didn't like -- see 73a9fd6 at line 4802

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to multiply by the sample rate anywhere anymore, right? We can just delete that part about multiplication and don't need this at all. I just find suddenly discussing playhead position in the intro to the node odd.

<code>offset</code> is negative.</span> If <code>offset</code>
is greater than <code>loopEnd</code>, playback will begin at
The <dfn id="dfn-offset">offset</dfn> parameter supplies a
<a>playhead position</a> where playback will begin. If 0 is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIkewise, I think the original text is fine.

index.html Outdated
can vary dynamically during playback:
</p>
<ul>
<li>A starting offset, which can be expressed in terms of fractions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The starting offset in the algorithm is not expressed in fractions of a sample. All time quantities are just pure time quantities in seconds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, yes the wording here is messed up.

at any desired point to increase the efficiency or quality of the
output.
</li>
<li>Sub-sample start offsets or loop points may require additional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this really needs to be specified. The algorithm below basically implies some kind of interpolation because the playback head is not guaranteed to be on a sample boundary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you feel the entire set of bullet points should go? I'm not sure exactly how much you would like to remove.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. I personally don't see a need to allow explicitly this optimization. The result just has to sound reasonably good and match the algorithm as specified except for whatever interpolation the UA needs to do (or not).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took it out.

index.html Outdated
</p>
<ol>
<li>Let <code>start</code> correspond to the <code>when</code>
parameter to <code>start()</code> if supplied, else 0.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editorial: nice to have "when" and "start()" be hyperlinks to the actual definitions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on this and the next

index.html Outdated
<li>Let <code>stop</code> correspond to the sum of the
<code>when</code> and <code>duration</code> parameters to
<code>start()</code> if supplied, or the <code>when</code>
parameter to <code>stop()</code>, otherwise <em>Infinity</em>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise: hyperlinks to when, duration, start(), and stop() would be nice.

index.html Outdated
<li>Let <code>currentTime</code> be the current time of the <a>
AudioContext</a>.
</li>
<li>Let <code>dt</code> be 1 / (_context sample rate_).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably didn't really mean to write "context sample rate". In any case, it needs to be written in some slightly different way.

index.html Outdated
frame. The initial value -1 indicates that this position has not
yet been determined.
</li>
<li>For each block of audio to be rendered:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be easier to understand (and probably much less verbose) if we used JS to implement this algorithm?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably remember that I expressed it in JS originally; then I reworked it as a narrative. There were a few main reasons I did that:

  • I found it necessary to package sample frames from playbackSignal() as arrays, which was awkward and resulted in various for loops that did not add clarity.

  • playbackSignal() itself could not be defined in JS beyond its signature, which felt strange to me.

  • Most of the other algorithmic descriptions in the spec are expressed in this form, not in JS (although some of them would maybe be clearer as pure code, too).

Anyway, I don't feel super strongly about the issue, those were just my reasons. I could certainly take it back to JS if you and @padenot feel it's best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave the decision to you. I just thought code would be easier to read and understand at this point because there's nothing that can't be easily expressed in JS.

I'm ok with playbackSignal() not being in JS. I'm ok if you added, say, output(sampleFrame,index) to output magically the output of playbackSignal() at the given index. Perhaps that's not appropriate for a spec; I don't know.

@joeberkovitz
Copy link
Contributor Author

I've further cleaned up the language in response to @rtoy's last review and removed mention of buffer optimization altogether.

At this point I'd love to get some final feedback including the example diagrams. @padenot in particular...

Copy link
Member

@rtoy rtoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more comments. This looks quite good and I only have a concern for negative playback and a minor concern about location of the playback position paragraph.

I'm looking at the graphs and text next in a separate review.

index.html Outdated
greater than or equal to <code>actualLoopEnd</code> and there is no
subsequent sample frame in <code>buffer</code>, then the next
sample frame should be taken as the playback signal at
<code>actualLoopStart</code> for interpolation purposes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I understand your point and agree. The only issue I have now is that this kind of seems to imply 2-point (linear) interpolation because you only say the next point is from loopStart.

Maybe good enough if you just said the next sample frames are taken from actualLoopStart?

@@ -4709,6 +4708,18 @@ <h2 id="AudioBufferSourceNode">
This node has no <a>tail-time</a> reference.
</p>
<p>
A <dfn>playhead position</dfn> for an <a>AudioBufferSourceNode</a> is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still find this odd to have this here. As mentioned below, we don't say anything about multiplying by the sample rate and loopStart and loopEnd are just values in seconds indexing into the buffer. I'd prefer this paragraph in the the playback section.

WDYT @padenot

index.html Outdated
<code>currentTime</code> attribute of the node's
<a>AudioContext</a>.
</li>
<li>Let <code>contextRate</code> be the value of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why give the sample rate the new name contextRate? We don't refer to the sample rate of the buffer anymore, so there's no chance of confusion in this algorithm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, it can be called sampleRate now.

index.html Outdated
<li>map the playhead position to the correct point
within the loop body accounting for loop iterations:
<ol>
<li>While <code>loop</code> is true and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this loop needs to be adjusted if the computed playback rate is negative. Wouldn't the actualLoopStart and actualLoopEnd be effectively swapped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, no, they wouldn't be swapped.

The while loop has the effect of mapping any playhead position past the end of the loop to an equivalent position that actually lies within the body of the loop. The test for past the end of the loop is simply a matter of comparison with actualLoopEnd and is independent of the direction in which bufferTime progresses.

Negative playback rates in this algorithm are effected simply by decreasing bufferTime on each loop iteration (I have clarified the language on that point separately, as the algorithm used to say "increasing" despite the fact that computedPlaybackRate could be negative). No other elements of the algorithm are needed to handle reverse playback -- we just move the playhead position backwards and let interpolation do all the other work.

index.html Outdated
<code>actualLoopEnd</code>,
<ol>
<li>Let <code>effectiveBufferTime</code> equal
<code>effectiveBufferTime - (actualLoopEnd -
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIkewise, for negative playback rate, I think we want to increment effectiveBufferTime instead of decrement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as point above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, bufferTime is already decremented in the outer part of the loop for negative PB rates (despite the fact that the language used to say "Increase"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, actualLoopStart is always less than actualLoopEnd, per https://rawgit.com/WebAudio/web-audio-api/478015ea3953ae08a4abc237fd985deba37146ce/index.html#looping-AudioBufferSourceNode, for sensible values of loopStart and loopEnd.

Assume offset is zero. Then the very first time bufferTime will get updated to some negative value if the playbackRate is negative. This is fine. Then you set effectiveBufferTime to bufferTime. You while-loop keeps subtracting actualLoopEnd - actualLoopStart which is always positive. So effectiveBufferTime keeps decreasing and get farther away from actualLoopEnd. The loop never terminates.

What am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The while-loop in this case won't do anything at all, because the loop condition says "While loop is true and effectiveBufferTime >= actualLoopEnd". Obviously the latter condition won't be satisfied. The loop only runs for bufferTime values that are after the end of the loop.

Separately, you have identified an issue with negative values of bufferTime that I need to rectify: I tried to use a negative value to mean "playback not started", but obviously I'll need a separate started flag or something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To further clarify and maybe avoid the next question, here's an assertion: if effectiveBufferTime is less than actualLoopEnd, then one of the following cases is true:

  1. The playhead is within the body of the loop. No need to do anything, because the purpose of the loop is to map positions in non-initial loop iterations into the equivalent positions within the initial iteration of the loop body, where they can be looked up and interpolated.

  2. The playhead is prior to the body of the loop. No need to do anything, because the loop is not involved.

  3. The playhead is prior to the buffer altogether (i.e. negative). No need to do anything, because silence will be output regardless.

Copy link
Member

@rtoy rtoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really appreciate you work on the graphs. They illustrate things nicely and are a ton of work.

index.html Outdated
<li>buffer samples are shown with the first sample frame at the
<em>x</em> origin.
</li>
<li>output samples are shown with the <code>start</code> context
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand what this sentence is trying to tell me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, that got super mangled. Have cleared it up.

index.html Outdated
</li>
</ul>
<p>
Basic playback with a simple loop that ends after the last sample
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editorial: This isn't a complete sentence. Maybe it should be the caption of the first figure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I tried using these long phrases as captions but I felt they did not format very clearly, which is why the captions are now very brief and this text precedes each figure.

index.html Outdated
</figcaption>
</figure>
<p>
<code>playbackRate</code> interpolation, illustrating half-speed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editorial: not a complete sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

index.html Outdated
<code>playbackRate</code> interpolation, illustrating half-speed
playback of buffer contents in which every other output sample
frame is interpolated. Of particular note is the last sample frame
in the looped output, which is interpolated using the loop start
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a note (or something) that this is an example assuming linear interpolation for illustration purposes and is not normative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great points. I moved them to the preamble that goes in front of all the figures, since they are applicable to every diagram.

index.html Outdated
</figcaption>
</figure>
<p>
Sample rate interpolation, illustrating playback of a buffer whose
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this diagram and text is needed. The algorithm is specified in terms of time in the buffer instead of samples, so it's pretty clear that some kind of interpolation is needed if the rates don't match.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to err on the side of caution. Perhaps let's see if we get other feedback. @padenot thoughts?

index.html Outdated
interact with looped output, showing that reversed playback is an
exact reversal of what would occur with forward playback including
the pre-loop portion of the buffer followed by a specific number of
loop iterations:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would have been really nice if the looped section wasn't symmetrical to make the time reversal clearer in the graphs. This is a ton of work so I'm not going to make you do this. :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, the symmetrical waveform was a mistake... Thanks for letting me off the hook :-)

index.html Outdated
In all figures, the following conventions apply:
</p>
<ul>
<li>context sample rate is 1000 Hz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the math and diagrams would be slightly easier if the context rate was 1 Hz and not 1000 Hz. Or maybe specify the times and stuff in the diagrams using ms units instead of sec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here was my reasoning: I wanted to make sure it was evident where we were using time instead of sample offsets, and my concern with milliseconds was that it's not a unit that we actually employ anywhere.

index.html Outdated
Subsample loop playback, showing how fractional frame offsets in
the loop endpoints map to interpolated data points in the buffer
that respect these offsets as if they were references to exact
sample frames:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice example. Would it be possible to have horizontal grid lines to show more clearly that the looped section isn't a copy? Hmm. I'm assuming that the second rep of the loop isn't exactly identical with the first pass through. I think the algorithm says they shouldn't be exactly the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the looped sections are suppsed to be identical copies although an editing error displaced one of the repeats vertically by mistake. I don't know why the repeats would be different in any way since the loop length is an exact multiple of the sample interval.

I did discover that the samples were marked wrongly in this diagram as interpolated vs. exact, and fixed that. I also updated all the diagrams to show the same distinction in the output signal (not just in the buffer contents) which I feel makes it all clearer.

@rtoy
Copy link
Member

rtoy commented Feb 22, 2017 via email

@joeberkovitz
Copy link
Contributor Author

So if I have a loop playing now (with playbackRate = 1) and I'm in the
middle of the loop, and then change playbackRate to -1, the playhead will
keep decrementing past the loop start and then continue to the beginning of
the buffer and then stop?

It depends -- your phrasing of the question doesn't fully expose the speced behavior.

Let's say you start an ABSN node with playbackRate = 1. The attack part of the sound plays, followed by N iterations of the loop (N obviously depends on how long the sound plays for).

Now set playbackRate to -1, in the middle of the N+1th iteration. The exact output up to this moment will now play in reverse. Part of the N+1th loop iteration will play backwards, followed by N full loops played backwards, followed by the attack played backwards. The diagram for reversed loop playback illustrates this behavior.

Thus, your statement isn't true unless one reverses playback during the very first loop iteration.

I think this is in fact what people want. In my work using sampler based synthesis, I am not aware of cases where infinite reversed looped playback is essential. Ask to play an audio event backwards, and you hear the exact reverse of it playing forwards, including attack phases etc. -- looping should not change that, in my opinion. But of course I would love to hear objections that this fails to satisfy some use case or other.

@joeberkovitz
Copy link
Contributor Author

joeberkovitz commented Feb 22, 2017

Here's a bit more followup, after having some more time to think about your question.

  • The proposed behavior causes a looped buffer to behave identically to a non-looped buffer which instantiates the loop body as a literal, infinite sequence of loop repeats. This feels consistent to me.

  • With this behavior, one can still achieve virtually indefinite reversed looping, by starting the node with a very large offset value (thus preventing the playhead from reaching zero, given some reasonable upper limit on duration).

  • Should we wish the loop body to "trap" the playhead for reverse playback purposes as you are suggesting, it is not hard to change the algorithm; we would simply add a loopStarted flag to track the playhead entering the loop body, plus a while-loop conditioned on this same flag which "folds forward" into the loop body when bufferTime < actualLoopStart -- very similar to the existing while-loop, but mapping in the other direction. However I'm still not convinced that this is the most consistent or useful behavior.

  • As a corollary to the above, the behavior you've proposed will introduce some anomalies for playhead positions prior to loopStart, including negative positions. Sometimes these will emit the "attack" phase of the sound or (for negative positions) silence; at other times, after the loop has been entered, these will emit part of the loop body. I don't like the idea of the same playhead position mapping to different sounds depending on the node's state.

@rtoy
Copy link
Member

rtoy commented Feb 23, 2017 via email

@joeberkovitz
Copy link
Contributor Author

Step 9.6.2.5 keeps decrementing bufferTime. That's fine. And since
bufferTime was in the middle of the loop, bufferTime < actualLoopEnd.
Then step 9.6.3.1 is never true because effectiveBufferTime < actualLoopEnd. So
9.6.2.5 takes effect, making bufferTime decrease more.

With you so far.

So eventually bufferTime < 0, causing an error (presumably). We need to do
something about that at least.

Correct, but I don't think this condition should cause an error, nor does it have anything to do with looping. The intent was that playbackSignal() should return silence if bufferTime is outside the bounds of the buffer (i.e. is negative, or greater than the duration of the buffer). Somehow this language got left out during the various edits that took place. I will fix that.

I think we agree that if you're looping and the playhead is in the looped
portion and you set the rate to negative, the looped portion continues
looping "forever", in reverse. I just don't see how the algorithm produces
that.

I don't agree with that (at least, not yet, for the reasons I outlined above), and indeed the current algorithm doesn't produce it :-) However I explained in #1143 (comment) how the algorithm can be adapted to do so, should we decide to make this change.

I feel we need a use case based argument that this is a good thing.

Oh, and I also see in 9.6, "length of the audio block to be rendered" seems
to be undefined. Not sure what you're trying to say here.

I was trying to say 128.

@rtoy
Copy link
Member

rtoy commented Feb 23, 2017 via email

@joeberkovitz
Copy link
Contributor Author

​That bit about "followed by N full loop​ played backwards" means something
different from what I thought it meant.

And I no longer understand what Fig 10 is showing where the playback rate
is -1 and three loops are shown. Does a negative playback rate actually
cause the loop to be looped?

Yes, in Fig 10 the loop does indeed get looped (and the content of the loop is backwards, although this is unfortunately not obvious due to my poor choice of waveform). The reversed loop keeps playing until the playhead reaches actualLoopStart, then the attack phase of the buffer is played backwards, followed by silence. So it's the exact reverse of what you get if you play the buffer forwards from offset=0 and hear silence, then the attack phase, then 3 loop iterations.

In the absence of a use-based counterargument I continue to think that playbackRate=-1, offset=T, duration=T should yield the exact reverse of playbackRate=1, offset=0, duration=T.

​I thought we were trying to get away from saying anything about render
quanta or anything like that. (That was one thing I didn't like about
Paul's original proposal: too closely tied to the internal implementations.)

We are indeed - my answer was a bit tongue-in-cheek.

The current phrasing "length of the block to be rendered" was an attempt to get away from a literal number. What other phrase should I use instead?

@rtoy
Copy link
Member

rtoy commented Mar 1, 2017 via email

@joeberkovitz
Copy link
Contributor Author

I wrote:

In the absence of a use-based counterargument I continue to think that
playbackRate=-1, offset=T, duration=T should yield the exact reverse of
playbackRate=1, offset=0, duration=T.

@rtoy said:

​I think this is what we want. I don't think the algorithm produces this.​

That's good - then we agree on the goal. Can you help me understand where the algorithm goes wrong? If playbackRate=1, offset=0, duration=T produces silence, forward attack and 3 forward loop iterations and playbackRate=-1, offset=T, duration=T produces 3 reversed loop iterations, reversed attack and silence, then isn't that achieving the goal?

I said:

The current phrasing "length of the block to be rendered" was an attempt
to get away from a literal number. What other phrase should I use instead?

@rtoy said:

​I don't think we need to limit this in anyway. Let the algorithm run
"forever". Or at least until playback is stopped either because stop() is
called or because the duration has been reached or looping has been turned
off and we've reached the end (or beginning) of the buffer.
But actually, we can just output silence forever too.​

The render-quantum block limit is there because the algorithm is described in terms of what the UA would do during any given render quantum, inside step 9 ("For each block of audio to be rendered..."). There are potentially an unlimited number of blocks, so the algorithm does indeed run "forever".

So it seems to me we don't disagree about this either.

@rtoy
Copy link
Member

rtoy commented Mar 1, 2017 via email

…y when the loop has been entered at least once.
@joeberkovitz
Copy link
Contributor Author

I've added language that supports @rtoy's use case for iterating backwards (infinitely) through a loop, at the expense of exactly reversing the node's output up to the current time (because now, a loop will "trap" backwards playback indefinitely).

I'm not sure this is correct but perhaps it's a useful foil for discussion.

@rtoy
Copy link
Member

rtoy commented Mar 15, 2017

index.html Outdated
function process(output) {
var dt = 1 / context.sampleRate;
var index = 0;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to initialize currentTime variable here from context.currentTime.

Copy link
Member

@rtoy rtoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for converting this to Javascript. It is soooo much easier to read and understand now.

index.html Outdated
var dt;

// Handle invocation of start method call
function handleStart(when, pos, duration) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this. It's not called anywhere so it's not clear what this is for. Well, I know you intend to use this to set start, offset, and stop, but perhaps you can just use comments to specify how start, offset and stop are initialized. I don't expect the code to be a full and complete implementation of ABSN.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

}

// Handle invocation of stop method call
function handleStop(when) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this too, for the same reason as handleStart().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

index.html Outdated
// Render a single frame of audio into the channel data arrays given by output
// at the given element index. The frame value may be an array of channel signal
// values, or a single value to be rendered to all channels.
function renderFrame(output, index, frame) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we basically have two undefined functions, maybe make playbackSignal work for both? So playbackPosition(position, output, index) will magically produce the right sample frame and place it in output[index]. For the cases were we want 0 output, maybe use playbackPosition(-1, output, index)?

Not sure if this is better, but having fewer imprecisly defined functions is good.

index.html Outdated
// Generate a single render quantum of audio to be placed
// in the channel arrays defined by output.
function process(output) {
var dt = 1 / context.sampleRate;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this outside of process since it's fixed and independent of the process loop.

index.html Outdated
}

// Render each sample frame in the quantum
while (index &lt; output[0].length) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output[0].length makes certain assumptions about what output is. For this description, just make output be a vector that holds the output. So output.length instead of output[0].length. I think we can handwave all of this away and make playbackSignal do the necessary magic, since it's already magical?

index.html Outdated
}
}

// Interpolate a multi-channel signal value for some sample frame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"frame" -> "time" seems better because the parameter is a time value, not a frame (index) value. And in the comments you use "frame" to mean the sample values.

index.html Outdated
}

// Render a single frame of audio into the channel data arrays given by output
// at the given element index. The frame value may be an array of channel signal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: Use "|output|" and "|index|" when referring to the values of the parameters.

May not want to say that "frame value may be an array" because we didn't say what playbackSignal returns.

index.html Outdated
</ol>
<pre>
// The AudioBuffer and context employed by this node
var buffer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: I think the new ES6 style is to use "let" instead of "var" unless you really need "var". I don't think we do here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change all the vars.

index.html Outdated
var context;

// The following variables capture attribute and AudioParam values for the node.
// They are updated k-rate, except for buffer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"buffer" isn't defined as a variable below, so maybe it shouldn't be mentioned here at all. (It's defined above.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

index.html Outdated
while (index &lt; output[0].length) {
if (currentTime &lt; start || currentTime &gt;= stop) {
// emit silence for the output frame for the element specified by index
renderFrame(output, index, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce nesting depth, it might be beneficial to have "continue" here so that the else statement below isn't needed? I consider this to be the "exceptional" case, so quick exit allows me to concentrate the the heart of the algorithm.

@joeberkovitz
Copy link
Contributor Author

@rtoy thanks for the great review. will make changes tomorrow

@joeberkovitz
Copy link
Contributor Author

@rtoy PTAL yet again. Thanks!

Copy link
Member

@rtoy rtoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of little nits.

With this new algorithm, do the diagrams really add anything? I'm mostly concerned about maintaining them. It looks like a ton of work to maintain.

index.html Outdated
var buffer;
var context;
let buffer; /* AudioBuffer employed by this node */;
let context; /* AudioContext employed by this node */;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. Remove ";" at end of line. Maybe just use // style comments here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this went through a couple of iterations and got messed up in the process.

index.html Outdated
}
}
// They are updated on a k-rate basis, prior to each invocation of process().
let loop, detune, loopStart, loopEnd, playbackRate;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know people disagree on this, but I hate the style that declares a bunch of vars all on one line.

I defer to others on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really care -- I usually don't use this style, but thought we wanted more compactness where possible. So happy to make this change.

index.html Outdated
}
// Variables for the node's playback parameters
let start = 0, offset = 0; // Set by start()
let stop = Infinity; // Set by stop(), or by start() with a supplied duration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe expand these comments to include your original description (or code) on what the values are. Now it's really unclear what offset and stop are actually set to.

Perhaps I was wrong about removing the functions. Maybe reinstate and do

let start = function_to_initialize_start();
let stop = function_to_initialize_stop();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that was why I had the functions. I will put them back in at this point. However, the use of let here doesn't make sense as start() and stop() can both be called multiple times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with having these functions is that they're never called (or mentioned that they're magically called) anywhere. So they serve no purpose.

So my proposal is to call them somehow either explicitly so implicitly using comments.

Or add comments to where start and stop are declared to explain how these values are initialized.

index.html Outdated
if (bufferTime &gt;= 0 &amp;& bufferTime &lt; buffer.duration) {
output.push(playbackSignal(bufferTime));
}
else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: Use

} else {

Here and elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure thing

index.html Outdated
else {
renderFrame(output, index, 0);
// Wrap loop iterations as needed
if (enteredLoop) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe convert this to

} else {

to emphasize that this is the alternative case for the if on line 5099.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but it is not the alternative case. The preceding if (!enteredLoop) may set enteredLoop to true.

@joeberkovitz
Copy link
Contributor Author

@joeberkovitz
Copy link
Contributor Author

@rtoy I still feel the diagrams are useful and they are less work to maintain than they were to create. If they become obsolete at some point, someone can take them out.

Copy link
Member

@rtoy rtoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple more nits.

The only real issue I have is that handleStart/Stop are defined but never called.

Other than that, I'm fine with this. I encourage others to review this.

index.html Outdated
}
// Variables for the node's playback parameters
let start = 0, offset = 0; // Set by start()
let stop = Infinity; // Set by stop(), or by start() with a supplied duration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with having these functions is that they're never called (or mentioned that they're magically called) anywhere. So they serve no purpose.

So my proposal is to call them somehow either explicitly so implicitly using comments.

Or add comments to where start and stop are declared to explain how these values are initialized.

@joeberkovitz
Copy link
Contributor Author

Thanks -- would love to merge this by next week. @padenot as co spec editor please take a look in particular.

handleStart/Stop() are invoked as a consequence of calling start() or stop() on the ABSN itself, much as process() is invoked by the audio graph. I am not sure where any call to these functions could go, since they are called as a consequence of implied messaging between the main and audio threads.

@rtoy
Copy link
Member

rtoy commented Mar 24, 2017 via email

@joeberkovitz
Copy link
Contributor Author

@rtoy Thanks -- I'll wait for further input on this point to avoid going around and around.

@padenot
Copy link
Member

padenot commented Mar 30, 2017

I am not sure where any call to these functions could go, since they are called as a consequence of implied messaging between the main and audio threads.

The spec text for AudioContet.suspend shows how to do this. "Implied" often means that something should be specced :-).

index.html Outdated
<code>start()</code> is called, then playback will continue
If the <code>loop</code> attribute is true when <a href=
"#widl-AudioBufferSourceNode-start-void-double-when-double-offset-double-duration">
<code>start()</code></a> is called, then playback will continue
indefinitely until <code>stop()</code> is called and the stop time
is reached. We'll call this "loop" mode. Playback always starts at
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're here, can you fix this to use the third person ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course!

index.html Outdated
playing until it reaches the <em>loopEnd</em> position in the
buffer (or the end of the buffer), at which point it will wrap back
around to the <em>loopStart</em> position in the buffer, and
continue playing according to this pattern.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is written with positive playbackRate in mind it seems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, yes.

The effective loop start and end points are required to lie within
the range of zero and the buffer duration, as specified in the
algorithm below. <code>loopEnd</code> is further constrained to be
at or after <code>loopStart</code>. If any of these constraints are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I don't see how this constraint has anything to do with the sign of playbackRate. loopEnd should always be after loopStart.

<p>
Loop-related properties may be varied during playback of the
buffer. The exact results are defined by the playback algorithm
which follows.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this paragraph be non-normative ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be normative, or why would we include the algorithm? The algorithm says a lot about how an ABSN behaves.

Perhaps I don't understand your thinking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about being a bit unclear. I meant something along the line of:

We have two ways of speccing the behaviour of the AudioBufferSourceNode, using text and using an algorithm. We should make the text non-normative, and the algorithm normative, so that we have a single authoritative resource to know if a particular behaviour is per spec or not.

The english text and the diagrams are useful to understand the general behaviour of things, and why it is specced like it is, but I don't think it should be normative, since it's open for interpretation and not as precise as the algorithm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, thanks, now I get it. I'll make these changes, it makes sense to me.

index.html Outdated
<p>
The following non-normative figures illustrate the behavior of the
algorithm in assorted key scenarios. Buffer optimization is not
considered, but by definition it does not materially affect the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are "buffer optimizations" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant dynamic resampling of the buffer -- I'll change it to be clearer.

Copy link
Member

@padenot padenot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with most things. I've left some comments.

@joeberkovitz
Copy link
Contributor Author

@padenot I've tried to apply your feedback - PTAL. I didn't think some of it was applicable, so maybe we need to talk further...

@joeberkovitz
Copy link
Contributor Author

@padenot to take the approach of connecting the algorithm with start() and stop(), I felt I had to reproduce the definition of stop() in ABSN although it's really the same as in AudioScheduledSourceNode. If you see another way, please let me know.

@joeberkovitz
Copy link
Contributor Author

@padenot at this point I've applied all the feedback from you that I have. Are we ready to merge?

@padenot padenot merged commit cb40861 into gh-pages Apr 11, 2017
@padenot
Copy link
Member

padenot commented Apr 11, 2017

Yep, thanks ! We can handle any followup in a separate, more focused, issue/PR.

@joeberkovitz joeberkovitz deleted the 95-absn-playback branch April 11, 2017 13:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants