Fix AudioStream catch-up clicks: add cooldown + crossfade #260

Merged
MaxHeimbrock merged 7 commits into main from max/audiostream-catchup-fix
Apr 28, 2026

MaxHeimbrock commented Apr 22, 2026

Summary

  • AudioStream's drift-correction path was firing on almost every audio callback and skipping samples via RingBuffer.SkipRead, which is a raw pointer move. The resulting step discontinuity produces a click/pop train under sustained tones.
  • Add a cooldown (SkipCooldownCallbacks = 10) so we never skip back-to-back — that is what turns a single correction into an audible artifact train.
  • When the skip does fire, read a 128-frame (~2.7 ms @ 48 kHz) post-skip window and linearly crossfade it into the tail of the output. The seam becomes a short linear ramp instead of a step — inaudible on voice and music. Guarded with AvailableRead() >= skipBytes + crossfadeBytes so we never trade a catch-up click for an underrun click.

Scoped change: one file (Runtime/Scripts/AudioStream.cs), +49/-10. No public API change, no RingBuffer/Resampler changes.
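
A minimal sketch of the crossfade step from the summary, not the code in Runtime/Scripts/AudioStream.cs. It assumes interleaved float samples and that the post-skip window has already been read into a scratch array. `CrossfadeSketch` and `CrossfadeIntoTail` are hypothetical names used only for illustration.

```csharp
using System;

public static class CrossfadeSketch
{
    // Blends `incoming` (post-skip audio) into the last incoming.Length samples
    // of `output` (pre-skip audio). Both buffers are interleaved; `channels`
    // samples make up one frame. Hypothetical helper, not the PR's method.
    public static void CrossfadeIntoTail(float[] output, float[] incoming, int channels)
    {
        if (channels <= 0)
            throw new ArgumentOutOfRangeException(nameof(channels));
        if (incoming.Length % channels != 0 || incoming.Length > output.Length)
            throw new ArgumentException("incoming must hold whole frames and fit inside output");

        int frames = incoming.Length / channels;
        int tailStart = output.Length - incoming.Length;

        for (int f = 0; f < frames; f++)
        {
            // Weight of the post-skip audio rises linearly and reaches 1 on the
            // last frame, so the callback ends exactly on the new read position.
            float t = (f + 1) / (float)frames;
            for (int c = 0; c < channels; c++)
            {
                int i = f * channels + c;
                output[tailStart + i] = output[tailStart + i] * (1f - t) + incoming[i] * t;
            }
        }
    }
}
```

For the 128-frame window mentioned above and a stereo stream, `incoming` would hold 256 samples. Only the tail of `output` is touched, so everything before the window plays back unmodified.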

Test plan

  • Build in Unity 6000.0.49f1 (and 2023.2.20f1 if convenient) — must compile clean.
  • Run Samples~/Basic against livekit-server --dev with a sustained-tone or music publisher. Before the fix: click train within seconds. After: no perceptible artifacts over a multi-minute session.
  • Temporarily log _buffer.AvailableRead() — fill should oscillate in the 30–160 ms band, not park at 100 ms, and should not pin at the HWM.
  • Confirm "AudioStream primed" appears once and no repeated underrun re-prime messages during steady-state playback.
  • Background / foreground cycle still recovers cleanly (existing OnApplicationPause path unchanged).
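
For the temporary fill logging in the test plan, a byte count is easier to compare against the 30–160 ms band once it is converted to milliseconds. The helper below is a sketch under assumptions: `BufferFillSketch` and `BytesToMilliseconds` are hypothetical names, and the sample rate, channel count, and sample width must be supplied by the caller because the PR text does not state the ring buffer's sample format.

```csharp
using System;

public static class BufferFillSketch
{
    // Converts a byte count of interleaved PCM into milliseconds of audio.
    // bytesPerSample is an assumption to be supplied by the caller
    // (for example 4 for 32-bit float samples, 2 for 16-bit PCM).
    public static double BytesToMilliseconds(int bytes, int sampleRate, int channels, int bytesPerSample)
    {
        if (sampleRate <= 0 || channels <= 0 || bytesPerSample <= 0)
            throw new ArgumentOutOfRangeException("sampleRate, channels and bytesPerSample must be positive");

        int bytesPerFrame = channels * bytesPerSample;
        double frames = (double)bytes / bytesPerFrame;
        return frames * 1000.0 / sampleRate;
    }
}
```

A log line could then pass `_buffer.AvailableRead()` as the first argument. Whether that value is in bytes is inferred from the `skipBytes + crossfadeBytes` guard in the summary.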

The drift-correction block in AudioStream.OnAudioRead was tripping on almost
every audio callback and calling RingBuffer.SkipRead, which is a raw pointer
move that produces a step discontinuity in the waveform — audible as a
click/pop train under any sustained tone.

Two causes: HighWaterMarkPercent at 0.50 meant normal jitter (a 30–40 ms burst
of 10 ms WebRTC frames) routinely parked the buffer above the skip threshold,
so the skip fired every callback; and each skip dropped samples without any
crossfade.

- HighWaterMarkPercent 0.50 -> 0.80 so only genuine drift near overflow
  triggers catch-up, not normal jitter.
- Add SkipCooldownCallbacks = 10 so we never back-to-back skip, which is what
  produces the gravelly artifact train.
- When we do skip, read a 128-frame (~2.7 ms @ 48 kHz) post-skip window and
  linearly crossfade it into the tail of the output. The seam becomes a short
  linear ramp instead of a step; inaudible on voice and music.
- Guard with an AvailableRead() >= skipBytes + crossfadeBytes check so we
  never trade a catch-up click for an underrun click.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
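
The cooldown and the underrun guard described in the commit message can be sketched as a small gate that is consulted once per audio callback. This is an illustration under assumptions, not the merged code: `CatchUpGate` and `ShouldSkip` are hypothetical names, and the strict "above the high-water mark" comparison is a guess at the threshold test.

```csharp
using System;

public sealed class CatchUpGate
{
    private readonly int _cooldownCallbacks;
    private int _callbacksUntilNextSkip;

    // cooldownCallbacks plays the role of SkipCooldownCallbacks (10 in the PR).
    public CatchUpGate(int cooldownCallbacks)
    {
        if (cooldownCallbacks < 0)
            throw new ArgumentOutOfRangeException(nameof(cooldownCallbacks));
        _cooldownCallbacks = cooldownCallbacks;
    }

    // Call once per audio callback. Returns true when a skip may run now.
    public bool ShouldSkip(int availableBytes, int highWaterMarkBytes, int skipBytes, int crossfadeBytes)
    {
        // Still cooling down from the previous skip: count this callback and wait.
        if (_callbacksUntilNextSkip > 0)
        {
            _callbacksUntilNextSkip--;
            return false;
        }

        // Fill is at or below the high-water mark: no catch-up needed.
        if (availableBytes <= highWaterMarkBytes)
            return false;

        // Not enough data to skip and still read the crossfade window:
        // skipping now would trade a catch-up click for an underrun click.
        if (availableBytes < skipBytes + crossfadeBytes)
            return false;

        _callbacksUntilNextSkip = _cooldownCallbacks;
        return true;
    }
}
```

With a cooldown of 10, a skip is followed by ten callbacks in which the gate stays closed regardless of fill level, so two corrections are never adjacent.
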
MaxHeimbrock force-pushed the max/audiostream-catchup-fix branch from bc69c08 to 6b3f700 on April 23, 2026 14:47

// Pre-buffering state to prevent audio underruns
private bool _isPrimed = false;
private const float BufferSizeSeconds = 0.2f; // 200ms ring buffer for all platforms

I wonder if we should bump the buffer size up to 250 ms for now, to make sure we can handle the drift better?

// HWM at 50% (100ms of 200ms) so normal network jitter does not trip catch-up.
// Cooldown prevents back-to-back skips, which sound like a gravelly click train;
// one occasional skip is inaudible thanks to the crossfade in OnAudioRead.
private const float HighWaterMarkPercent = 0.50f;

Is HighWaterMarkPercent still 50%?

// Cooldown prevents back-to-back skips, which sound like a gravelly click train;
// one occasional skip is inaudible thanks to the crossfade in OnAudioRead.
private const float HighWaterMarkPercent = 0.50f;
private const float SkipPerCallbackPercent = 0.05f;

I think we should revisit the logic: is skipping 5% of the audio noticeable?

If it is noticeable, we should reconsider the approach. For example, would it work better to skip just the initial 50 ms of audio in the buffer, so that the stream is continuous from then on? Or should we increase the buffer size?

MaxHeimbrock changed the title from "Fix AudioStream catch-up clicks: raise HWM, add cooldown + crossfade" to "Fix AudioStream catch-up clicks: add cooldown + crossfade" on Apr 27, 2026
xianshijing-lk left a comment


lgtm, we can ship the current version though it is slightly more complicated than I would expect.

MaxHeimbrock merged commit 68e3dd4 into main on Apr 28, 2026
15 checks passed
MaxHeimbrock deleted the max/audiostream-catchup-fix branch on April 28, 2026 12:30
2 participants