fix(broadcast): micro edge fades on DJ voice clips to kill boundary clicks#830
Merged
…licks

Some TTS engines cut the WAV hard at the file boundary; once the mic-chain compressor's makeup gain lifts the clip, that hard edge lands as an audible click. Fade each spoken clip in/out over 40 ms (below speech-onset perception, so the first word is untouched). Applied per track on both voice queues, before amplify/mic_chain so the compressor never sees the raw edge.

Verified with liquidsoap --check against savonet/liquidsoap:v2.4.4 (the broadcast image base).
perminder-klair added a commit that referenced this pull request on Jul 4, 2026
… kills the whole clip (#837)

* fix(broadcast): drop fade.out from voice edge_fade, it silenced every DJ clip

fade.out on a request.queue source doesn't know the track's remaining time in this Liquidsoap build, so it treats the whole clip as inside the fade zone and multiplies it to ~0. Since #830, every voice segment (links, idents, hourly, weather, request intros) aired as dead air: listeners heard only the duck engaging as a 3-4 s volume dip mid-song.

Verified empirically against the broadcast image: a sine pushed through the exact chain renders at rms≈13600 plain, ≈13500 with fade.in only, and 0 with the shipped fade.in+fade.out composition.

fade.in alone is kept: it still kills the head-boundary click #830 targeted, and is proven harmless.

On-air check after deploy: manual /dj/segment station-id shows voice peaks riding the duck instead of silence.

* feat(tts): bake 40ms edge fades into rendered voice clips

Render time is the only place a clip's true length is known, so the tail fade that can't live in radio.liq (see previous commit) is applied here. audio/wav-edges.ts applies a 40ms linear head+tail ramp in-place to canonical PCM WAVs (16-bit int and 32-bit float); anything else (notably the cloud engine's mp3 output, where encoder padding already avoids edge clicks) is left untouched. Best-effort by design: a clip never fails to air because polish couldn't be applied.

Wired into both success paths of tts.speak(), so every engine and every voice kind gets faded edges from the one chokepoint.

Verified on rendered on-air WAVs: heads ramp from 0, tails land at 0, speech body untouched.
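The linear head+tail ramp described in the second commit can be pictured with a minimal sketch. This is not the code in audio/wav-edges.ts: the function name applyEdgeFades, its parameters, and the choice to operate on an already-decoded mono Int16Array (rather than parsing a WAV header or handling 32-bit float) are all assumptions made for illustration.

```typescript
// Hypothetical helper, NOT the actual audio/wav-edges.ts implementation.
// Applies a linear fade-in and fade-out, in place, to mono 16-bit PCM samples.
function applyEdgeFades(
  samples: Int16Array,
  sampleRate: number,
  fadeMs: number = 40,
): Int16Array {
  // Fade length in samples, capped at half the clip so the head and tail
  // ramps never overlap on very short clips.
  const fadeLen = Math.min(
    Math.floor((sampleRate * fadeMs) / 1000),
    Math.floor(samples.length / 2),
  );

  for (let i = 0; i < fadeLen; i++) {
    // Gain is 0 at the file boundary and rises linearly toward 1.
    const gain = i / fadeLen;

    // Head: scale sample i.
    samples[i] = Math.round(samples[i] * gain);

    // Tail: mirror index, so the last sample gets gain 0.
    const j = samples.length - 1 - i;
    samples[j] = Math.round(samples[j] * gain);
  }
  return samples;
}

// Usage: a constant-level clip at an (assumed) 1 kHz sample rate,
// so 40 ms corresponds to a 40-sample ramp at each edge.
const clip = new Int16Array(200).fill(10000);
applyEdgeFades(clip, 1000, 40);
```

Because the tail ramp needs the clip's total length, a function like this can only run where the whole clip is in hand, which matches the commit's point that the tail fade belongs at render time.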
This was referenced Jul 5, 2026
What
Adds a 40 ms fade-in/fade-out to every spoken clip on both DJ voice channels (voice_queue/intro_queue) in radio.liq, applied per track before amplify/mic_chain.
Why
The voice WAV/MP3 previously started and ended dry. Some TTS engines cut the file hard at the boundary, and the mic-chain compressor's +7 dB makeup gain amplifies that discontinuity into an audible click/tick at the start or end of a spoken segment. The existing softness (0.8 s smooth_add duck ramps + silent lead-in) shapes the music around the voice but never touched the voice file's own edges.
40 ms is below speech-onset perception, so the first word is not softened; the silent lead-in clip gets faded too, which is harmless.
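As a rough illustration of the gain arithmetic above, +7 dB of makeup gain corresponds to an amplitude multiplier of about 2.24, so a hard edge at the file boundary comes out roughly 2.24 times larger. The sketch below is a simplification that treats the compressor as static gain only (it ignores gain reduction and attack/release behaviour); dbToLinear and boundaryStep are hypothetical names, not anything in radio.liq.

```typescript
// Convert a gain in decibels to a linear amplitude multiplier.
function dbToLinear(db: number): number {
  return Math.pow(10, db / 20);
}

// Size of the jump from silence (0) to the first sample of a clip after
// makeup gain is applied. Assumption: static gain only, no compression.
function boundaryStep(firstSample: number, makeupDb: number): number {
  return Math.abs(firstSample) * dbToLinear(makeupDb);
}

// A clip whose first sample sits at -1000 (16-bit scale), with +7 dB makeup gain.
const step = boundaryStep(-1000, 7);
console.log(step.toFixed(1));
```

With a fade-in, the first sample is scaled to 0 before the gain stage, so the same multiplication leaves no step at the boundary.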
Performance
Effectively zero: fade.in/fade.out are amplitude envelopes (one multiply per sample during the 40 ms windows, passthrough otherwise) on a source that is silent most of the time; that is noise next to the MP3/Opus encoders already running.
Verification
liquidsoap --check passes against savonet/liquidsoap:v2.4.4 (the broadcast image base), confirming the fades type-check on the request.queue sources with no "Early computation of source content-type" error (the failure mode that blocked HPF/EQ here).