Runtime-owned audio drifts from GSAP timeline after repeated pause/play #668

@markwitt1

Description

Version

Observed on HyperFrames main / @hyperframes/player@0.5.3.

Relevant source:

  • packages/core/src/runtime/media.ts
  • packages/core/src/runtime/player.ts
  • packages/core/src/runtime/init.ts
  • packages/player/src/hyperframes-player.ts

Summary

Timed <audio data-start> can drift out of sync with the GSAP visual timeline during interactive preview/player playback, especially after repeated pause/play or under degraded performance.

This shows up clearly in speech-driven videos where narration, captions, and visual timing need to stay aligned.

Expected behavior

For narration, captions, and speech-driven videos, the audio and the visual timeline should stay within a tight A/V tolerance, roughly 50-80 ms.

Repeated pause/play should not accumulate persistent drift.

Actual behavior

Runtime-owned audio can remain hundreds of milliseconds out of sync. The current runtime deliberately does not correct sub-500ms steady-state drift, and stable drift below 3s may persist.

This is easy to reproduce by repeatedly pausing and playing a composition with timed narration/audio.

Source-level notes

syncRuntimeMedia() only seeks media when drift is greater than 0.5s and either:

  • this is the first active tick,
  • the offset jumped by more than 0.5s in one tick,
  • or drift exceeds 3s.

Conceptually:

const drift = Math.abs(currentElTime - relTime);
const offset = relTime - currentElTime;
const prevOffset = lastOffset.get(el);
const firstTickOfClip = prevOffset === undefined;
const offsetJumped = !firstTickOfClip && Math.abs(offset - prevOffset) > 0.5;
const catastrophicDrift = drift > 3;

if (drift > 0.5 && (firstTickOfClip || offsetJumped || catastrophicDrift)) {
  el.currentTime = relTime;
}
lastOffset.set(el, offset); // recorded each tick so the next tick can detect jumps

This means stable drift between 0.5s and 3s can remain uncorrected.

The tests also encode this behavior: one test expects 0.4s of drift to remain uncorrected, and another expects audio to stay at 0 while the timeline advances during buffering. That avoids skipping opening audio, but for narration it means visual time can run ahead of audio.

The parent-proxy fallback path has a stricter synchronizer: MIRROR_DRIFT_THRESHOLD_SECONDS = 0.05 with consecutive-sample gating. However, that path only activates after iframe autoplay failure. Normal runtime-owned media still uses the looser 0.5s / 3s strategy.
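For comparison, a consecutive-sample gate like the one in the mirror path could be expressed roughly as follows. This is a sketch, not the actual HyperFrames implementation: the 0.05s threshold mirrors MIRROR_DRIFT_THRESHOLD_SECONDS from the source, but the sample count and function names are assumptions.

```typescript
// Assumed gate width: three over-threshold readings in a row before seeking,
// so one noisy currentTime sample never triggers a correction.
const STRICT_DRIFT_THRESHOLD_SECONDS = 0.05;
const REQUIRED_CONSECUTIVE_SAMPLES = 3;

function makeDriftGate(
  threshold = STRICT_DRIFT_THRESHOLD_SECONDS,
  required = REQUIRED_CONSECUTIVE_SAMPLES,
) {
  let overCount = 0;
  // Returns true only once `required` consecutive samples exceed `threshold`.
  return function shouldSeek(mediaTime: number, timelineTime: number): boolean {
    const drift = Math.abs(mediaTime - timelineTime);
    overCount = drift > threshold ? overCount + 1 : 0;
    if (overCount >= required) {
      overCount = 0; // reset after deciding to seek
      return true;
    }
    return false;
  };
}
```

With this gating, a steady 0.2s drift would be corrected within a few ticks, whereas the runtime-owned 0.5s / 3s strategy would leave it in place indefinitely.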

Reproduction idea

Use a GSAP composition with a timed narration track:

<audio
  id="narration"
  data-start="0"
  data-duration="10"
  src="./narration.mp3"
></audio>

Then repeatedly toggle playback:

for (let i = 0; i < 40; i += 1) {
  player.pause();
  await new Promise((resolve) => setTimeout(resolve, 100));
  player.play();
  await new Promise((resolve) => setTimeout(resolve, 100));
}

After enough toggles, narration and animation/captions can become visibly or audibly offset.

Suggested direction

Short-term:

  • Add a strict media sync mode for speech/narration use cases.
  • Lower the runtime-owned media drift-correction threshold from 500ms to ~50-80ms in strict mode.
  • Use a consecutive-sample gate like the parent proxy path.
  • Force media sync immediately on play, pause, seek, renderSeek, and playback-rate changes.
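A minimal sketch of how the strict mode and forced-sync-on-transport-events could be plumbed together. The option shape, event names, and function names here are hypothetical, not existing HyperFrames API:

```typescript
// Hypothetical strict-mode configuration for runtime-owned media sync.
interface MediaSyncOptions {
  mode: "default" | "strict";
  driftThresholdSeconds: number;
}

function resolveSyncOptions(mode: "default" | "strict"): MediaSyncOptions {
  // Strict mode tightens the steady-state correction threshold from
  // 500ms toward the 50-80ms A/V tolerance speech content needs.
  return mode === "strict"
    ? { mode, driftThresholdSeconds: 0.06 }
    : { mode, driftThresholdSeconds: 0.5 };
}

// Transport events that should always force an immediate resync in strict
// mode, regardless of the steady-state threshold.
const TRANSPORT_EVENTS = ["play", "pause", "seek", "renderSeek", "ratechange"] as const;

function shouldForceSync(event: string, opts: MediaSyncOptions): boolean {
  return opts.mode === "strict" &&
    (TRANSPORT_EVENTS as readonly string[]).includes(event);
}
```

Forcing a sync on every transport event is what prevents the repeated pause/play loop from accumulating drift, since each toggle re-anchors media time to timeline time.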

Long-term:

  • Add a single transport clock for preview playback.
  • When audible audio is active, use audio time as the master clock and seek GSAP from it.
  • When no audio is active, use a monotonic performance clock.
  • If audio buffers, stall visual time instead of allowing visuals to run ahead.
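The master-clock selection could look roughly like this. The `ClockSource` shape is an illustration of the idea, not a proposed API; returning `null` stands in for "hold the last visual frame" while audio buffers:

```typescript
// Single transport clock: audible audio is master when present; otherwise a
// monotonic performance clock drives the timeline.
type ClockSource =
  | { kind: "audio"; currentTime: () => number; buffering: () => boolean }
  | { kind: "performance"; startedAtMs: number };

function transportTime(source: ClockSource, nowMs: number): number | null {
  if (source.kind === "audio") {
    // Stall visual time while audio buffers, instead of letting GSAP
    // run ahead of narration.
    return source.buffering() ? null : source.currentTime();
  }
  // No audible audio: monotonic wall-clock fallback, in seconds.
  return (nowMs - source.startedAtMs) / 1000;
}
```

The key property is that GSAP is seeked from this clock rather than advancing on its own ticker, so audio can never fall behind the visuals by design.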

A WebAudio driver may be useful later, but it should probably be part of a single-clock transport design rather than a standalone replacement for <audio>.
