Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

It's hard to set size(width/height) information in VideoDecoderConfig #94

Closed
ytakio opened this issue Oct 19, 2020 · 21 comments · Fixed by #186
Closed

It's hard to set size(width/height) information in VideoDecoderConfig #94

ytakio opened this issue Oct 19, 2020 · 21 comments · Fixed by #186
Labels
breaking Interface changes that would break current usage (producing errors or undesired behavior).

Comments

@ytakio
Copy link

ytakio commented Oct 19, 2020

(I'm sorry if this issue is duplicated one.)
According to Spec, VideoDecoder requires size(codec and crop) information to configure decoder.
But I think that it's hard to set without analyzing raw stream when I received only raw stream.

Like H.26x codec includes coded size information in SPS as RBSP. So decoder may do without passing size information.

VideoFrame already has information to render appropriately. However I would be happy if there is some notification callback to get size information before decoding a first frame.

@sandersdan
Copy link
Contributor

Take a look at #26 for reasoning behind the current design.

In general we cannot know what the size will be before a frame is decoded, typically this information is provided by the media container instead.

In a prototype we had fields for aspect ratio instead of size, but we thought this was more difficult to explain. I'll leave this open to discuss that choice.

@ytakio
Copy link
Author

ytakio commented Oct 22, 2020

Thank you for information of current design.
Yes, some codecs don't include self size information, so container take in charge from them.

On the other hands, MPEG2-System (like MPEG2-TS/PS Format) doesn't include this kind of information(width/height/PAR/DAR).
(I don't want to take care of MPEG2-TS, though...)

AS #26 (comment) points out, For VideoDecoder.configure() size information in VideoDecoderConfig seems just a hint initialize.
In spec, if we set invalid parameter of size, VideoDeocoder.configure() will fail and send exception even though it would be just hint.

So I'd appreciate it if WebCodecs would consider that let me set codedWidth and codedHeight zero as unknown size.

To check if a VideoDecoderConfig is a valid VideoDecoderConfig, run these steps:
If codec is not a valid codec string, return false.
If codedWidth = 0 or codedHeight = 0, return false.
If cropWidth = 0 or cropHeight = 0, return false.
If cropTop + cropHeight >= codedHeight, return false.
If cropLeft + cropWidth >= codedWidth, return false.
If displayWidth = 0 or displayHeight = 0, return false.
Return true.

(Aspect ratio is complicated one for some not experts, so VideoFrame.displey(Width|Height) presenting looks like good design.)

@ytakio
Copy link
Author

ytakio commented Oct 23, 2020

My apologies for bothering you repeatedly.

I'm not good at JavaScript (and English😅), so following description might seem oddly.

So I'd appreciate it if WebCodecs would consider that let me set codedWidth and codedHeight zero as unknown size.

It means that how about does WebCodecs allow us to call VideoDecoder.configure() without setting VideoDecoderConfig.(codedWidth| codedHeight) ?

@sandersdan
Copy link
Contributor

Your request is understood, and I believe it is possible.

@ytakio
Copy link
Author

ytakio commented Feb 17, 2021

I'm sorry for bothering agin.
Are there still needs to keep require attribute for codedWidth/codedHight in VideoDecoderConfig, yet?

(Currently, I've been able to configure without codedWidth/codedHight, though)
I'd appreciate it if you would confirm it.

@sandersdan
Copy link
Contributor

sandersdan commented Feb 17, 2021

@chcunningham: Last time this was discussed this there wasn't a reason to mark these as required, and Chrome's implementation without required has been working.

Chrome's implementation doesn't handle any of this correctly, I filed crbug.com/1179101.

@chcunningham chcunningham added this to the 2021 Q1 milestone Mar 16, 2021
@chcunningham
Copy link
Collaborator

@sandersdan if it's only function is to hint at what type of internal decoder (hardware vs software) should be created, perhaps we should also drop the crop and display sizes?

chcunningham added a commit that referenced this issue Apr 21, 2021
For VideoDecoderConfig, the coded size attributes are no longer
required. They only serve as a hint for the UA to create an approriate
underlying decoder (e.g. hardware accelerated for higher resolutions)
and are not strictly required. Fixes #94.

For VideoEncoderConfig, cropWidth and cropHeight are renamed to simply
be width and height. The crop prefix was added in #113 as I realized
that we were describing the "crop" size of the encoded VideoFrames. But
I now think this risks giving the impression that the encoder is going
to crop frames. Being intuitive is more important than being pedantic.
@sandersdan
Copy link
Contributor

The crop and display sizes are already optional, but they are used to compute the pixel aspect ratio to apply to decoded frames. I didn't find a satisfactory way to use them without also providing a coded size though, so I'm not sure where to go from here. Some options:

  • Without coded size, crop is arbitrary and only has meaning in relation to display size.
  • Without coded size, crop must not be provided, and display size is treated as a display aspect ratio.
  • Remove crop, display size is always a display aspect ratio. (Note: not very compatible with streams that change display aspect ratio in-band, but workable if the app explicitly reconfigures at those boundaries.)

@chcunningham
Copy link
Collaborator

Re-opening to track resolution of the above comment

@chcunningham chcunningham reopened this Apr 23, 2021
@ytakio
Copy link
Author

ytakio commented Apr 26, 2021

Without coded size, crop must not be provided, and display size is treated as a display aspect ratio.

+1

As a codec side perspective, resolution and sample aspect ratio settings are only for encoders, not for decoders in general. A decoder just reads such information from bitstream. Important thing in here ratio doesn't mean display size.
(SAR must be same in a GOP for bitstream spec, but sequence can be reconfigured with sequence header of each codec spec at each boundaries.)

In regards to crop settings, it doesn't need for encoder and decoder, I think. I mean that video codec(both of encoder and decoder) doesn't accept crop information as settings. They just pickup or draw pixel in accordance with a frame format.

So IMO, here is the thing...

dictionary VideoEncoderConfig {
  required DOMString codec;
  unsigned long long bitrate;
  required unsigned long width;  // it is width of resolution
  required unsigned long height;
  unsigned long sar_width;    // or dar_width (display aspect ratio)
  unsigned long sar_height;   // or dar_height
  HardwareAcceleration hardwareAcceleration = "allow";
};
dictionary VideoDecoderConfig {
  required DOMString codec;
  BufferSource description;
  unsigned long sar_width;    // or dar_width. It'll overwrite bitstream settings if set
  unsigned long sar_height;   // or dar_height
  HardwareAcceleration hardwareAcceleration = "allow";
};
[Exposed=(Window,DedicatedWorker)]
interface VideoFrame {
  ...
  readonly attribute PixelFormat format;
  readonly attribute FrozenArray<Plane> planes;
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;
  readonly attribute unsigned long cropLeft;
  readonly attribute unsigned long cropTop;
  readonly attribute unsigned long cropWidth;
  readonly attribute unsigned long cropHeight;
  readonly attribute unsigned long displayWidth;    // it'll be ignore at encoding
  readonly attribute unsigned long displayHeight;
  readonly attribute unsigned long long? duration;
  readonly attribute unsigned long long? timestamp;
  ...
};

I would appreciate it if you would confirm.

@sandersdan
Copy link
Contributor

sandersdan commented Apr 26, 2021

The IDL you provided seems logical to me, but the description doesn't seem quite right. Decoders shouldn't care about aspect ratio in general, but they do produce VideoFrames which contain that information. Depending on the codec the aspect ratio information may be in the bitstream or in the container (or both), so a generic decode API does need to include it.

(Note: Chrome's <video> implementation prefers container aspect ratio metadata over in-band for more consistent cross-platform behavior. This matches your description of sar replacing bitstream metadata.)

SAR is the most technically correct way to provide the information to the decoder, but it's usually quite inconvenient to work with (eg. MP4 metadata [excluding optional PASP] includes only a DAR, and conversions are lossy). Intended display size is at least usually integer values.

I don't know if there is a silver bullet here, but I think it's down to two choices:

  1. Crop is optional. When a crop size is provided, use it to compute a PAR from the display size, otherwise just use display size as a DAR. If everything is missing, assume 1:1 PAR.
  2. Deprecate crop. Always treat display size as a DAR, or use 1:1 PAR if display size is missing.

I'm leaning towards (2) because we can always add a PAR path later if there is demand for it, in a backwards-compatible way. This would mean that apps would need to reconfigure at places where the DAR changes, but it's likely they would be doing that anyway.

@ytakio
Copy link
Author

ytakio commented Apr 27, 2021

Thank you for your confirmation.

Decoders shouldn't care about aspect ratio in general, but they do produce VideoFrames which contain that information. Depending on the codec the aspect ratio information may be in the bitstream or in the container (or both), so a generic decode API does need to include it.

Totally agree. SAR/DAR information are just a metadata for codec.
But I hope that WebCodecs parses resolutions and aspect ratio information from bitstream if they include. Because it's a little bit redundant for WebApp to parse header and want to keep common interfaces for every codec.

Deprecate crop. Always treat display size as a DAR, or use 1:1 PAR if display size is missing.

+1. So I wanted to ask that is it possible to consider to revise like in the following?

  • Deprecate crop. Always treat display size as a DAR, or use PAR in bitstream if display size is missing and bitstream includes PAR, or use 1:1 PAR if any aspect ratio information is missing.

(Some webcam may output raw bitstream which include PAR in VUI, that's why I hope above)

@sandersdan
Copy link
Contributor

Deprecate crop. Always treat display size as a DAR, or use PAR in bitstream if display size is missing and bitstream includes PAR, or use 1:1 PAR if any aspect ratio information is missing.

If we did this, it would likely be on a best-effort basis (allowed but not mandated by the spec). My reasoning is that not all decoders reliably expose this information, so a given implementation may not be able to do this correctly in every case.

Usually I prefer solutions that result in reliable cross-platform behavior, but I'm interested to hear if you prefer best-effort here.

@ytakio
Copy link
Author

ytakio commented Apr 28, 2021

Thank you. I'm sorry that I overlooked the point regarding standardization.
It's totally agree that you concern about.

It reminds me that some other person told that there are no way to get some metadata from bitstream (like SPS, VUI, SEI in H.26x) with WebCodecs.

So, I prefer original (2) in #94 (comment) as you describe, too. On the other hand, I think WebCodecs needs a method to get Metadata (including feature check method, as informative in spec?)
It may be very useful even if WebCodecs just pass a start point of each Metadata byestream (like SPS,VUI,SEI...) to WebAPPs.

Anyway, thank you so match for hearing my voice.

@chcunningham
Copy link
Collaborator

I think the above points are persuasive (great discussion). I support option (2) in #94 (comment)

On the other hand, I think WebCodecs needs a method to get Metadata (including feature check method, as informative in spec?) It may be very useful even if WebCodecs just pass a start point of each Metadata byestream (like SPS,VUI,SEI...) to WebAPPs.

I've filed #198 to track this request

@chcunningham
Copy link
Collaborator

Triage note: marked 'breaking', as the outcome of the above discussion is to deprecate crop sizing from the VideoDecoderConfig. In practice this shouldn't actually break anyone, as Chrome's impl doesn't currently use those sizes (other than to validate them), so they're ignored either way.

@chcunningham chcunningham added the breaking Interface changes that would break current usage (producing errors or undesired behavior). label May 12, 2021
@chcunningham
Copy link
Collaborator

In reviewing the PR I realized I didn't completely understand the resolution above. I thought config.displayWidth|Height was a passthrough to set frame.displayWidht|Height. But @sandersdan clarified his intent was that this actually provide an aspect ratio setting, which would have the effect of increasing one of frame.displayWidth or frame.displayHeight as needed to produce the describe aspect ratio as described in html.

Beyond the html historical precedent, using scaling in this way has the nice property that it allows for inband resoution changes to occur (assuming the new resolution uses the same DAR).

To help users avoid the confusion I was having, we propose renaming config.displayWidth|Height -> config.displayAspectWidth|Height.

@ytakio
Copy link
Author

ytakio commented May 21, 2021

To help users avoid the confusion I was having, we propose renaming config.displayWidth|Height -> config.displayAspectWidth|Height.

It looks clear for me.

I have another idea. This Issue figured out that a aspect ratio is just a meta data for rendering.
I think, it is depended on a way how WebApp render decoded frame. So there may be another way that VideoFrame provide a method to set displayAspectWidh|height.

This is my just idea.
So, of course, your proposal is no problems at all, I think.

@sandersdan
Copy link
Contributor

So there may be another way that VideoFrame provide a method to set displayAspectWidh|height.

I think this is the same as #161?

@ytakio
Copy link
Author

ytakio commented May 22, 2021

I think this is the same as #161?

Wow, it's great and perfect! 🎉
Thanks to above, I think that WebApp can render arbitrarily and keep aspect ratio to every surface which takes CanvasImageSource.
(🤔 Ah... videoAspectWidth|Height are not needed anymore, maybe? I'm not sure, though Just thought, Please forget it.)

Thanks for your telling 😄

@tguilbert-google
Copy link
Member

The spec has been updated to remove the crop attributes, in #250

Please re-open if necessary!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Interface changes that would break current usage (producing errors or undesired behavior).
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants