Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RTCRtpEncodingParameters: scaleResolutionTo #159

Open
henbos opened this issue Apr 13, 2023 · 8 comments
Open

RTCRtpEncodingParameters: scaleResolutionTo #159

henbos opened this issue Apr 13, 2023 · 8 comments

Comments

@henbos
Copy link
Collaborator

henbos commented Apr 13, 2023

We all know and love scaleResolutionDownBy... it let's you do this:

Capture 720p and apply expensive video effects on the track.
Send {active, scaleResolutionDownBy:1 (720p)} + {active, scaleResolutionDownBy:2 (360p)} simulcast.

But then server tells us 720p is not needed, so we send {inactive, active:360p}.
Now you might wonder, why are we applying expensive video effects on a 720p track if we're only sending it in 360p?

So we track.applyConstraints() to 1/2 downscale the track and we setParameters() to 2x the scaling factors to counter the fact that the input frame size is smaller. The end result is:

360p track sent as {inactive} + {active, scaleResolutionDownBy:1 (still 360p)}.

Problem: This is racy. The track changing size and the parameters updating the scaling factors are not in-sync, so you might send 720p on the VGA layers for a few frames. You're wasting a key frame on sending the wrong resolution and then you're wasting another key frame to adjust back to the desired 360p resolution. Working around this by temporarily inactivating all encodings while the track is resizing will appear like glitches to the receiver of the stream and there probably would still have to be key frames when the encodings are re-activated. The API was not designed for this.

Proposal: Add "scaleResolutionTo: 360" API which is equivalent to always having the correct scaling factor. In the above example, changing the track's size from 720p to 360p would not result in reconfiguring the encoder and there would be no key frames or races.

@henbos
Copy link
Collaborator Author

henbos commented Apr 13, 2023

Something like this already exists in libwebrtc (RtpEncodingParameters::requested_resolution) so it might be fairly straight-forward to experiment with this in JavaScript

@henbos
Copy link
Collaborator Author

henbos commented Apr 13, 2023

@aboba

@fippo
Copy link
Contributor

fippo commented Apr 13, 2023

scaling resolution down by a factor (to make things worse: a freely choosable one instead of powers of two) has always been confusing so 👍 for fixing this.

@Orphis
Copy link
Contributor

Orphis commented Apr 13, 2023

We may not want to have a single value, but a few more to describe how the change is done. While some resolutions are standard and we usually can recognise the usual ones from one dimension, we need both to work with both portrait and landscape tracks.

Quite often, the ratio will be different, so the behaviour needs to be defined there too. Do we want to rescale? Keep the ratio but crop? Or add black bars? Or fit to the height or width parameter and crop the outside parts?
If we're doing this, do we want crop coordinates too in the API? (for example to frame a participants head)

@Orphis
Copy link
Contributor

Orphis commented Apr 13, 2023

Also, what happens if the track is smaller than the requested resolution? Should the aspect ratio be changed to match the requested resolution's ratio? So is it a min value, max value or target value?

@henbos
Copy link
Collaborator Author

henbos commented Apr 13, 2023

Landscape vs portrait mode has to be addressed. There are a few different options I can think of:

  • If we use a single pixel value (e.g. 360 for 360p) then it could be defined as "height if height < width, otherwise width".
  • If a {width, height} dictionary is specified, we could crop-and-scale according to some rule.
  • Or we add more attributes to increase specificity from the app, perhaps something like VideoResizeModeEnum.

Regarding if the track is smaller than requested: I don't think we should ever up-scale, so I only see two options:

  • Either we send it as-is with scale factor 1.
  • Or we implicitly inactive that encoding.

I don't have strong opinions, just listing options.

@dontcallmedom-bot
Copy link

This issue had an associated resolution in WebRTC May 2023 meeting – 16 May 2023 (Issue #159: RTCRtpEncodingParameters: scaleResolutionDownTo):

RESOLUTION: Overall support in the direction, but details need to be fleshed out and complexity assessed

@aigarrz
Copy link

aigarrz commented Feb 19, 2024

What should happen when video frame adapter reduces input resolution to below requested size?

Example:

  • Requested to send two stream simulcast: 720p @ 1M + 540p @ 1.2M
  • Initially input frame size is 720p
  • EncoderOveruse triggers down adaptation and reduces input frame size to 540p
  • Which of these layouts should be sent to peer?
    [A] 540p @ 1M + 540p @ 1.2M
    [B] 540p @ 1.2M
    [C] 540p @ 1M

If the peer is a SFU, only option C avoids bitrate overshooting E2E.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants