
Timestamps and time domains #8

Closed
steveanton opened this issue Aug 28, 2019 · 4 comments
Labels
hard Hard problem that needs discussion

Comments

steveanton (Contributor)
This is a tracking issue for describing how timestamps work with WebCodecs and integrate with the video and audio playout time domains.

@pthatcherg pthatcherg added the hard Hard problem that needs discussion label Sep 18, 2019
padenot (Collaborator) commented Oct 17, 2019

In particular, when it comes to connecting WebCodecs objects to MediaStream, there needs to be provision for dealing with multiple clock domains. This is especially important for audio, but also quite critical for video.

chcunningham (Collaborator)
@padenot can you give a toy example of how the problem manifests?

padenot (Collaborator) commented Feb 12, 2020

Here is a typical setup:

  • Record a microphone/webcam (those are MediaStreamTracks) using WebCodecs. This gives you a series of encoded packets with timestamps in the clock domain of the input device. In particular, frames become available at a very specific rate, which might not match the rate of the system clock. For example, it's almost guaranteed that 1 second of audio captured at 44100 Hz (i.e. 44100 frames) was actually captured in slightly less or slightly more than 1 second on the system clock: this is a first clock drift.
  • Decode it using WebCodecs, and play it out using another MediaStreamTrack. Depending on the output device (screen/speakers), it will probably be clocked differently than the input device. This means that, say, 30 frames captured at 30 fps on the webcam are not going to be played out in exactly 1 second on a 60 fps screen: the clock domains of the webcam and the screen are different, and both differ from the system clock. Because of this second drift, a webcam frame will eventually be either dropped or displayed for 3 screen frames.
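The first kind of drift above can be made concrete with a small sketch. This is illustrative only (the function names are hypothetical, not part of any Web API): it estimates the ratio between the device clock, inferred from how many frames arrived, and the system clock.

```python
# Hypothetical sketch: estimating drift between a capture device's clock
# and the system clock. Not part of WebCodecs or any browser API.

def estimate_drift(frames_received: int, sample_rate: int,
                   system_elapsed_s: float) -> float:
    """Ratio of device time to system time; 1.0 means no drift."""
    device_elapsed_s = frames_received / sample_rate
    return device_elapsed_s / system_elapsed_s

# 44100 audio frames nominally equal 1 s, but the system clock saw 1.001 s,
# so the device clock runs slightly slow relative to the system clock:
ratio = estimate_drift(44_100, 44_100, 1.001)  # ratio < 1.0
```

Over a long capture, even a ratio this close to 1.0 accumulates: after an hour it amounts to several seconds of divergence, which is why it cannot simply be ignored.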

For video, simply repeating the frame is the approach usually taken (though not always). For audio, adaptive resampling is absolutely essential, or else problems will appear: either latency builds up if the drift goes in one direction, or gaps (translating into very problematic audio glitches) appear if it goes in the other direction.
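Both compensation strategies mentioned above can be sketched in a few lines. This is a toy illustration under assumed names, not any browser's implementation: for video, decide how many display refreshes each captured frame occupies (0 means the frame is dropped); for audio, nudge the resampling ratio so a playout buffer stays near a target fill level.

```python
# Illustrative drift compensation (hypothetical helpers, not a real API).

def display_repeats(capture_fps: float, display_fps: float,
                    frame_index: int) -> int:
    """How many display refreshes the given captured frame covers.

    Returns 0 when the frame is dropped, >1 when it is repeated."""
    start = round(frame_index * display_fps / capture_fps)
    end = round((frame_index + 1) * display_fps / capture_fps)
    return end - start

def adjust_ratio(buffer_frames: int, target_frames: int,
                 nominal_ratio: float = 1.0, gain: float = 1e-6) -> float:
    """Adaptive resampling: speed up slightly when the buffer grows
    (to drain it), slow down when it shrinks, keeping latency bounded."""
    error = buffer_frames - target_frames
    return nominal_ratio + gain * error
```

A real adaptive resampler would smooth the buffer-level estimate and use a proper control loop, but the principle is the same: small continuous corrections instead of audible drops or gaps.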

chcunningham (Collaborator) commented May 12, 2021

Most of the concern about playout via MediaStreamTrack is now out of scope for WebCodecs, and might instead go to https://github.com/w3c/mediacapture-transform/issues. Issue w3c/mediacapture-transform#35 might be a good starting point.

For sync generally, I think the resolution here is the same as the discussion in #39 (comment). In short, apps can overcome skew concerns by letting audio drive the clock (as a function of its output samples and sample rate), and then painting video frames to match wherever audio is at.
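The "audio drives the clock" strategy above can be sketched as follows. The helper names are hypothetical; the idea is simply that the media clock is derived from samples actually played out, and a video frame is painted once its timestamp is at or behind that clock.

```python
# Sketch of audio-master A/V sync (illustrative names, not a Web API).

def audio_clock_s(samples_played: int, sample_rate: int) -> float:
    """Media clock derived from audio output, immune to system-clock skew."""
    return samples_played / sample_rate

def next_frame_to_paint(frame_timestamps_s: list[float],
                        samples_played: int, sample_rate: int):
    """Latest queued frame whose timestamp is not ahead of the audio clock.

    Returns None if no frame is due yet."""
    now = audio_clock_s(samples_played, sample_rate)
    due = [t for t in frame_timestamps_s if t <= now]
    return max(due) if due else None

# After 48 000 samples at 48 kHz the audio clock reads 1.0 s, so the frame
# stamped 0.966 s is painted and the frame stamped 1.033 s keeps waiting.
```

Because the clock counts samples that actually reached the output device, any drift between the audio device and the system clock is absorbed automatically: video simply follows audio.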

Closing as I don't think there's anything else in scope for WC here, but please reopen if I've overlooked anything.


4 participants