VideoEncoder knob for latency/realtime mode #269

Closed
chcunningham opened this issue Jun 4, 2021 · 11 comments
Labels
extension Interface changes that extend without breaking.

Comments

@chcunningham
Collaborator

chcunningham commented Jun 4, 2021

We've acknowledged the need for this in a few issues, including #241 (comment) and #240. Here's a strawman proposal that ties into #103.

Interface extensions

  • Add a new latencyMode = "realtime" | "quality" (default) attribute to VideoEncoderConfig
    • Other bikeshed color: lowLatency = true | false
  • Add framerate (e.g. = 60) to VideoEncoderConfig.
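
A minimal sketch of what a config carrying the two proposed fields might look like. The names `StrawmanEncoderConfig` and `withDefaults` are hypothetical and exist only for illustration; only `latencyMode`, `framerate`, and the `"quality"` default come from the strawman above.

```typescript
// Hypothetical types mirroring the strawman; not copied from any spec text.
type LatencyMode = "realtime" | "quality";

interface StrawmanEncoderConfig {
  codec: string;
  width: number;
  height: number;
  bitrate?: number;          // bits per second
  framerate?: number;        // frames per second (proposed addition)
  latencyMode?: LatencyMode; // proposed addition; "quality" when omitted
}

// Fill in the proposed default so callers can always read latencyMode.
function withDefaults(
  config: StrawmanEncoderConfig
): StrawmanEncoderConfig & { latencyMode: LatencyMode } {
  return { ...config, latencyMode: config.latencyMode ?? "quality" };
}

// A video-call style config that opts into realtime behavior.
const callConfig = withDefaults({
  codec: "vp8",
  width: 640,
  height: 480,
  bitrate: 1_000_000,
  framerate: 60,
  latencyMode: "realtime",
});

// A recording style config that relies on the default mode.
const recordingConfig = withDefaults({ codec: "vp8", width: 1280, height: 720 });
```

Using a string enum rather than a `lowLatency` boolean leaves room for further modes later without changing the attribute's type.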

Semantics:

  • realtime mode means:
    • try to give output chunks immediately without requiring additional input frames
    • where applicable, trade quality for latency (see libVPX and VideoToolbox options)
    • if framerate is provided, consider framerate as a deadline hint for producing output frames.
    • allow dropping of frames to meet encoderConfig.bitrate and framerate targets
  • quality mode means:
    • fine to hold outputs while waiting for more inputs
    • optimize for quality
    • don't drop frames
    • framerate is not a deadline target.
  • In both modes:
    • use framerate as guide for rate control wrt bitrate
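
The semantics above can be summarized as a small lookup, sketched here under stated assumptions. `behaviorFor` and `ModeBehavior` are hypothetical names, and treating the deadline as exactly one frame interval (`1000 / framerate` ms) is an assumption; the proposal only calls framerate a "deadline hint". Realtime mode is also described as "try to" emit immediately, so the boolean here is a simplification.

```typescript
type ModeName = "realtime" | "quality";

interface ModeBehavior {
  mayHoldOutputForMoreInput: boolean;
  mayDropFrames: boolean;
  frameDeadlineMs: number | null; // null: framerate is not a deadline
}

// Map a mode (and optional framerate) to the behaviors listed in the proposal.
function behaviorFor(mode: ModeName, framerate?: number): ModeBehavior {
  if (mode === "realtime") {
    return {
      mayHoldOutputForMoreInput: false,
      mayDropFrames: true,
      // Assumption: the deadline hint is one frame interval.
      frameDeadlineMs:
        framerate !== undefined && framerate > 0 ? 1000 / framerate : null,
    };
  }
  // quality mode: holding outputs is fine, drops are not, no deadline.
  return {
    mayHoldOutputForMoreInput: true,
    mayDropFrames: false,
    frameDeadlineMs: null,
  };
}
```

For example, `behaviorFor("realtime", 50)` yields a 20 ms deadline hint, while `behaviorFor("quality", 50)` yields none.
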
@Djuffin
Contributor

Djuffin commented Jun 4, 2021

I think in all modes framerate should be a driving force for per frame bitrate allocation.
This corresponds to the behavior of most platform and software encoders.
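
A rough illustration of why framerate matters for rate control: the average bit budget per frame is the target bitrate divided by the framerate. The helper name `averageBitsPerFrame` is hypothetical, and real encoders distribute bits unevenly (key frames usually get more), so this is only the mean.

```typescript
// Average bit budget per frame for a target bitrate and framerate.
function averageBitsPerFrame(bitrateBps: number, framerateFps: number): number {
  if (!(bitrateBps > 0) || !(framerateFps > 0)) {
    throw new RangeError("bitrate and framerate must be positive");
  }
  return bitrateBps / framerateFps;
}

// Same 1.2 Mbps target, different framerates.
const budgetAt30 = averageBitsPerFrame(1_200_000, 30); // 40000 bits per frame
const budgetAt60 = averageBitsPerFrame(1_200_000, 60); // 20000 bits per frame
```

Doubling the framerate at a fixed bitrate halves the average per-frame budget, which is why the encoder needs to know the framerate in either mode.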

@chcunningham
Collaborator Author

I think in all modes framerate should be a driving force for per frame bitrate allocation.
This corresponds to the behavior of most platform and software encoders.

I agree. I've updated the opening comment to clarify.

@chcunningham added the extension label (Interface changes that extend without breaking.) Jun 5, 2021
@chcunningham
Collaborator Author

Copying @Djuffin's relevant comment from #240

When in realtime mode, encoders might have to drop frames in order to meet the bandwidth or time budget, but that is not an error. They might indicate a dropped frame by providing an empty output (or maybe just no output at all).
Encoders are responsible for ensuring that, even if a frame is dropped, the resulting stream can still be decoded as if nothing happened. Subsequent encoded frames should not depend on the dropped frame.

Those behaviors sgtm.
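
If drops are signaled by "no output at all", an application that cares can detect them by comparing timestamps. This is a sketch only: `droppedTimestamps` is a hypothetical helper, and it assumes each output chunk carries the timestamp of the frame it encodes.

```typescript
// Timestamps are in microseconds, as on WebCodecs frames and chunks.
// Returns the submitted timestamps for which no output chunk was seen.
function droppedTimestamps(submitted: number[], emitted: number[]): number[] {
  const seen = new Set(emitted);
  return submitted.filter((ts) => !seen.has(ts));
}

// Four frames submitted at roughly 30 fps; the third produced no output.
const submittedTs = [0, 33333, 66666, 100000];
const emittedTs = [0, 33333, 100000];
const dropped = droppedTimestamps(submittedTs, emittedTs);
```

Because an encoder may hold outputs for a while, a check like this is only meaningful once all pending outputs have been delivered, for example after a flush.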

@aboba
Collaborator

aboba commented Jun 11, 2021

I don't think that a desire for "real-time" necessarily indicates that "quality" is not desired.

In content-hints, we have hints for "motion", "detail" and "text", with "motion" meaning a preference for framerate, "detail" a preference for resolution, and "text" similar to "detail" but also potentially turning on screen coding tools.

Note also the definition of degradationPreference.

@chcunningham
Collaborator Author

In content-hints, we have hints for "motion", "detail" and "text", with "motion" meaning a preference for framerate, "detail" a preference for resolution, and "text" similar to "detail" but also potentially turning on screen coding tools.

Note also the definition of degradationPreference.

Do you know to what extent these features are implemented across codecs under the hood? Particularly outside of the vpx library? My sense from scanning platform encoding APIs is they may not offer this rich set of options. But realtime vs quality seems pretty common.

@chcunningham
Collaborator Author

Moving discussion from #242

@Djuffin said:

If an encoder drops a frame because it can't meet bitrate constraints, it is not an error. It is a completely normal situation.

@aboba said

That makes sense to me, but in Issue #240, it is noted that dropping, if disallowed (even with a bitrate constraint??), would be considered an implementation bug. Seems to me like it shouldn't be permitted to simultaneously disallow frame drops and set a bitrate constraint.

The idea in this issue is that dropping is disallowed when mode = quality. In that mode, bitrate is not a constraint, but a target. We can alternatively break bitrate into bitrateMax (constraint) and bitrate (target), but I also like the proposal as-is.
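
A sketch of the alternative split, to make the target/constraint distinction concrete. `bitrateMax` is the hypothetical attribute floated in this thread, not an existing API, and `BitrateSplitConfig`, `validateBitrateSplit`, and `dropsAllowed` are illustrative names. The rule that `bitrateMax` must not be below `bitrate` is an assumption.

```typescript
interface BitrateSplitConfig {
  latencyMode: "realtime" | "quality";
  bitrate: number;     // target, bits per second
  bitrateMax?: number; // hypothetical hard constraint, bits per second
}

// Collect human-readable problems instead of throwing on the first one.
function validateBitrateSplit(config: BitrateSplitConfig): string[] {
  const problems: string[] = [];
  if (!(config.bitrate > 0)) {
    problems.push("bitrate must be positive");
  }
  // Assumption: a constraint lower than the target is contradictory.
  if (config.bitrateMax !== undefined && config.bitrateMax < config.bitrate) {
    problems.push("bitrateMax must not be below bitrate");
  }
  return problems;
}

// Under the proposal as written, only realtime mode may drop frames.
function dropsAllowed(config: BitrateSplitConfig): boolean {
  return config.latencyMode === "realtime";
}
```

With the proposal as-is, the same intent is expressed by the mode alone: `bitrate` is a target in quality mode, and drops are only allowed in realtime mode.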

@Djuffin
Contributor

Djuffin commented Jun 11, 2021

Seems to me like it shouldn't be permitted to simultaneously disallow frame drops and set a bitrate constraint.

Many encoders (across different platforms and APIs) allow CBR without frame drops. Sometimes the encoder just can't meet the bitrate budget. I don't think that WebCodecs needs to be different from other APIs here.

@chcunningham
Collaborator Author

Editors call:

We can alternatively break bitrate into bitrateMax (constraint) and bitrate (target), but I also like the proposal as-is.

@aboba does like the clarity of having a distinct "max" attribute. Avoids confusion previously seen in WebRTC.

@chcunningham following up w/ RTC implementers to understand how these attributes affect (passthrough?) the encoders or whether they're higher level.

@chcunningham
Collaborator Author

@chcunningham following up w/ RTC implementers to understand how these attributes affect (passthrough?) the encoders or whether they're higher level.

Discussed this with @ilyanikolaevskiy. In his own words:

Regarding content-hint. It's very WebRTC-specific and mostly affects the WebRTC stack above the encoders. It's supported all the way.
First, it configures WebRTC to never downscale screenshare feeds. Then, it configures SW encoders to use a special temporal layers pattern, if scalability is required (for conferences). Only the VP8 encoder supports scalability for screenshare; the VP9 implementation is experimental and not launched anywhere. It also tunes some parameters like denoising, QP ranges, and bitrate overshoot thresholds. It also enables some features like variable encode framerate (if QP is very low and the input is static, framerate is reduced greatly).

From this I take away that the knobs we are now proposing are sort of orthogonal to WebRTC's content-hint. It sounds like we can satisfy the requirements above using the scalabilityMode config + some additional knobs for noise and QP.

@chcunningham
Collaborator Author

Editors call: the knobs proposed here sound good enough to send PRs. I will file separate issues for additional low-level knobs shortly (min / max QP, motion vs detail hint, low-level buffer dependency specification, etc.).
