Version
Observed on HyperFrames main / @hyperframes/player@0.5.3.
Relevant source:
packages/core/src/runtime/media.ts
packages/core/src/runtime/player.ts
packages/core/src/runtime/init.ts
packages/player/src/hyperframes-player.ts
Summary
Timed <audio data-start> elements can drift out of sync with the GSAP visual timeline during interactive preview/player playback, especially after repeated pause/play or under degraded performance.
This shows up clearly in speech-driven videos where narration, captions, and visual timing need to stay aligned.
Expected behavior
For narration, captions, and speech-driven videos, the audio and the visual timeline should stay within a tight A/V tolerance of roughly 50-80ms.
Repeated pause/play should not accumulate persistent drift.
Actual behavior
Runtime-owned audio can remain hundreds of milliseconds out of sync. The current runtime deliberately does not correct sub-500ms steady-state drift, and stable drift below 3s may persist.
This is easy to reproduce by repeatedly pausing and playing a composition with timed narration/audio.
Source-level notes
syncRuntimeMedia() seeks media only when drift is greater than 0.5s and at least one of the following holds:
- this is the first active tick,
- the offset jumped by more than 0.5s in one tick,
- drift exceeds 3s.
Conceptually:
const drift = Math.abs(currentElTime - relTime);
const offset = relTime - currentElTime;
const prevOffset = lastOffset.get(el);
const firstTickOfClip = prevOffset === undefined;
const offsetJumped = !firstTickOfClip && Math.abs(offset - prevOffset) > 0.5;
const catastrophicDrift = drift > 3;
if (drift > 0.5 && (firstTickOfClip || offsetJumped || catastrophicDrift)) {
  el.currentTime = relTime;
}
This means stable drift between 0.5s and 3s can remain uncorrected.
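To make the gap concrete, the condition quoted above can be restated as a pure function and evaluated for a few cases. This is only a sketch: shouldSeek is a hypothetical name introduced here for illustration, not a function in the runtime, and it assumes the quoted snippet reflects the real logic.

```typescript
// Hypothetical pure restatement of the condition quoted in this issue.
// It is not the actual syncRuntimeMedia() implementation.
function shouldSeek(
  currentElTime: number,
  relTime: number,
  prevOffset: number | undefined,
): boolean {
  const drift = Math.abs(currentElTime - relTime);
  const offset = relTime - currentElTime;
  const firstTickOfClip = prevOffset === undefined;
  const offsetJumped =
    prevOffset !== undefined && Math.abs(offset - prevOffset) > 0.5;
  const catastrophicDrift = drift > 3;
  return drift > 0.5 && (firstTickOfClip || offsetJumped || catastrophicDrift);
}

// Audio is a steady 1s behind the timeline, and was already 1s behind on the
// previous tick: no seek happens, so the drift persists.
const stableOneSecond = shouldSeek(4.0, 5.0, 1.0);

// The same 1s drift on the first active tick of the clip is corrected.
const firstTick = shouldSeek(4.0, 5.0, undefined);

// Drift above 3s is always corrected.
const catastrophic = shouldSeek(1.0, 5.0, 4.0);
```

Under these assumptions, only the first case leaves the drift in place, which is the steady-state situation that repeated pause/play tends to produce.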
The tests also encode this behavior: one test expects 0.4s drift to remain uncorrected, and another expects audio to stay at 0 while the timeline advances during buffering. That avoids skipping opening audio, but for narration it means visual time can run ahead of audio.
The parent-proxy fallback path has a stricter synchronizer: MIRROR_DRIFT_THRESHOLD_SECONDS = 0.05 with consecutive-sample gating. However, that path only activates after iframe autoplay failure. Normal runtime-owned media still uses the looser 0.5s / 3s strategy.
Reproduction idea
Use a GSAP composition with a timed narration track:
<audio
  id="narration"
  data-start="0"
  data-duration="10"
  src="./narration.mp3"
></audio>
Then repeatedly toggle playback:
for (let i = 0; i < 40; i += 1) {
  player.pause();
  await new Promise((resolve) => setTimeout(resolve, 100));
  player.play();
  await new Promise((resolve) => setTimeout(resolve, 100));
}
After enough toggles, narration and animation/captions can become visibly or audibly offset.
Suggested direction
Short-term:
- Add a strict media sync mode for speech/narration use cases.
- Lower runtime-owned media drift correction from 500ms to ~50-80ms in strict mode.
- Use a consecutive-sample gate like the parent proxy path.
- Force media sync immediately on play, pause, seek, renderSeek, and playback-rate changes.
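The strict-mode threshold and the consecutive-sample gate from the short-term list could be combined roughly as follows. This is a minimal sketch, not the parent-proxy implementation: DriftGate, its method names, and the default of two required samples are assumptions made for illustration. Only the 0.05s threshold comes from MIRROR_DRIFT_THRESHOLD_SECONDS.

```typescript
// Hypothetical strict-mode gate: request a seek only after drift has exceeded
// the threshold on several consecutive samples, so a single noisy reading
// does not cause an audible jump.
class DriftGate {
  private readonly thresholdSeconds: number;
  private readonly requiredSamples: number; // assumed value, not from the source
  private streak = 0;

  constructor(thresholdSeconds = 0.05, requiredSamples = 2) {
    this.thresholdSeconds = thresholdSeconds;
    this.requiredSamples = requiredSamples;
  }

  // Returns true when the caller should set el.currentTime = timelineTime.
  sample(mediaTime: number, timelineTime: number): boolean {
    if (Math.abs(mediaTime - timelineTime) <= this.thresholdSeconds) {
      this.streak = 0;
      return false;
    }
    this.streak += 1;
    if (this.streak < this.requiredSamples) return false;
    this.streak = 0; // start counting again after requesting a correction
    return true;
  }

  // Call on play, pause, seek, and rate changes, where sync is forced anyway.
  reset(): void {
    this.streak = 0;
  }
}
```

The gate keeps the 50ms tolerance from triggering on one-off timing jitter, while forced syncs on transport events bypass it entirely.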
Long-term:
- Add a single transport clock for preview playback.
- When audible audio is active, use audio time as the master clock and seek GSAP from it.
- When no audio is active, use a monotonic performance clock.
- If audio buffers, stall visual time instead of allowing visuals to run ahead.
A WebAudio driver may be useful later, but it should probably be part of a single-clock transport design rather than a standalone replacement for <audio>.
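The long-term single-clock idea might look something like the sketch below. Everything here is hypothetical: TransportClock, AudioLike, and the tick() signature are illustrative names, and "audible audio is active" is approximated as "an audio element exists and is not paused". The readyState value 3 is the standard HTMLMediaElement.HAVE_FUTURE_DATA constant.

```typescript
// Minimal structural type so the sketch runs without a DOM.
type AudioLike = { currentTime: number; paused: boolean; readyState: number };

const HAVE_FUTURE_DATA = 3; // same value as HTMLMediaElement.HAVE_FUTURE_DATA

// Hypothetical transport clock: audio is the master while it plays, a
// monotonic clock drives time otherwise, and buffering stalls the timeline.
class TransportClock {
  private position = 0; // timeline seconds
  private lastNowMs: number | undefined = undefined;

  // nowMs is expected to come from a monotonic source such as performance.now().
  tick(nowMs: number, audio: AudioLike | null, clipStart = 0): number {
    const elapsed =
      this.lastNowMs === undefined ? 0 : (nowMs - this.lastNowMs) / 1000;
    this.lastNowMs = nowMs;

    if (audio !== null && !audio.paused) {
      if (audio.readyState >= HAVE_FUTURE_DATA) {
        // Audio is playing with data available: follow its clock.
        this.position = clipStart + audio.currentTime;
      }
      // Otherwise audio is buffering: hold position so visuals do not run ahead.
      return this.position;
    }

    // No active audio: advance from the monotonic clock.
    this.position += elapsed;
    return this.position;
  }
}
```

In a preview loop, the returned value would be used to seek the GSAP timeline on each frame, so visuals follow the audio clock instead of the audio being corrected toward the visuals.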