Skipped Frames at the beginning of file with some decoders #113

Closed
mkver opened this issue Dec 8, 2016 · 6 comments

@mkver

mkver commented Dec 8, 2016

Hello,

I have an issue with some missing frames at the start of a file (not at the very beginning, but close). More specifically, frames 25-27 (frame count starts at zero) of the file "Skipped.Frames.mkv" are skipped when I use software decoding or DXVA (native and copy-back) decoding, but not with Intel Quick Sync. I tested this both with GraphStudioNext and the latest release (0.68.1) and with MPC-HC and its internal filters (based upon 0.66). They are even skipped when I fast-forward frame-by-frame and I cannot seek to these frames. Microsoft's DTV-DVD decoder plays it just fine (at least in hardware; I am unable to test whether it would play it in software, too, because it automatically uses hardware).
The missing frames are B-frames shared between two GOPs (the latter of which is open). Here is an extract from mkvinfo:

| + SimpleBlock (track number 1, 1 frame(s), timecode 0.400s = 00:00:00.400)
| + Frame with size 6081
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.420s = 00:00:00.420)
| + Frame with size 6079
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.440s = 00:00:00.440)
| + Frame with size 5935
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.460s = 00:00:00.460)
| + Frame with size 5923
|+ Cluster
| + Cluster timecode: 0.500s
| + SimpleBlock (key, track number 1, 1 frame(s), timecode 0.640s = 00:00:00.640)
| + Frame with size 48044
| + SimpleBlock (discardable, track number 1, 1 frame(s), timecode 0.560s = 00:00:00.560)
| + Frame with size 10815
| + SimpleBlock (discardable, track number 1, 1 frame(s), timecode 0.500s = 00:00:00.500)
| + Frame with size 3662
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.520s = 00:00:00.520)
| + Frame with size 5051
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.540s = 00:00:00.540)
| + Frame with size 5048
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.580s = 00:00:00.580)
| + Frame with size 1981
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.600s = 00:00:00.600)
| + Frame with size 5238
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.620s = 00:00:00.620)
| + Frame with size 5238
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.760s = 00:00:00.760)
| + Frame with size 35275
| + SimpleBlock (discardable, track number 1, 1 frame(s), timecode 0.660s = 00:00:00.660)
| + Frame with size 988
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.680s = 00:00:00.680)
| + Frame with size 4837
| + SimpleBlock (track number 1, 1 frame(s), timecode 0.700s = 00:00:00.700)
| + Frame with size 4829

And here is an extract from h264_parse. It shows that mkvmerge (which I used to create these MKVs) made no error in determining the decoding and presentation order. The first I-frame is the one with timecode 0.64 above:

Nal length 12 start code 4 bytes
ref 0 type 6 SEI
payload_type: 6 recovery_point
payload_size: 1 0x84
recovery_frame_cnt: 0
exact_match_flag: 0
broken_link_flag: 0
changing_slice_group_idc: 0
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 47889 start code 4 bytes
ref 2 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 7 (I)
pic_parameter_set_id: 0
frame_num: 28 (5 bits)
pic_order_cnt_lsb: 32
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 10806 start code 4 bytes
ref 2 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 6 (B)
pic_parameter_set_id: 0
frame_num: 29 (5 bits)
pic_order_cnt_lsb: 24
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 3652 start code 4 bytes
ref 0 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 6 (B)
pic_parameter_set_id: 0
frame_num: 30 (5 bits)
pic_order_cnt_lsb: 18
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 5042 start code 4 bytes
ref 0 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 6 (B)
pic_parameter_set_id: 0
frame_num: 30 (5 bits)
pic_order_cnt_lsb: 20
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 5039 start code 4 bytes
ref 0 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 6 (B)
pic_parameter_set_id: 0
frame_num: 30 (5 bits)
pic_order_cnt_lsb: 22
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 1972 start code 4 bytes
ref 0 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 6 (B)
pic_parameter_set_id: 0
frame_num: 30 (5 bits)
pic_order_cnt_lsb: 26
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 5229 start code 4 bytes
ref 0 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 6 (B)
pic_parameter_set_id: 0
frame_num: 30 (5 bits)
pic_order_cnt_lsb: 28
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 5229 start code 4 bytes
ref 0 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 6 (B)
pic_parameter_set_id: 0
frame_num: 30 (5 bits)
pic_order_cnt_lsb: 30
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 35266 start code 4 bytes
ref 2 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 5 (P)
pic_parameter_set_id: 0
frame_num: 30 (5 bits)
pic_order_cnt_lsb: 44
Nal is new picture
Nal length 9 start code 4 bytes
ref 0 type 6 SEI
payload_type: 1 pic_timing
payload_size: 1 0x4
pict_struct: 0
clock_timestamp_flag[0]: 0
Nal length 979 start code 4 bytes
ref 0 type 1 Coded slice of non-IDR picture
first_mb_in_slice: 0
slice_type: 6 (B)
pic_parameter_set_id: 0
frame_num: 31 (5 bits)
pic_order_cnt_lsb: 34
Nal is new picture

As one can see, this is a pretty long B-chain; and inside this chain, the decode and presentation orderings do not coincide. Is this even allowed? (This is an unreencoded satellite recording so it should be.)
There is also another strange thing: If I do not cut frame-accurately at the beginning, but include a little bit more, these frames are not skipped (with either decoder). You can see this behaviour in "No.Skipped.Frames.mkv" in this zip-file.
I use Win 7 64bit and have an Intel i3 4005U. But at least the part with software-decoding should be hardware-independent so I hope you can reproduce it.
And thanks for the huge amount of time you have invested in these filters.
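As a rough way to quantify the reordering in the h264_parse extract above, the following sketch counts, for each frame, how many earlier-decoded frames must be displayed after it. The POC values are copied from the extract; the function name is made up for illustration, and the sketch assumes that `pic_order_cnt_lsb` does not wrap around inside this short window.

```python
# pic_order_cnt_lsb values in decode order, copied from the h264_parse extract.
# Assumption: no POC wrap-around occurs inside this short window.
POCS_DECODE_ORDER = [32, 24, 18, 20, 22, 26, 28, 30, 44, 34]


def required_reorder_depth(pocs):
    """Largest number of frames decoded before a frame but displayed after it."""
    depth = 0
    for i, poc in enumerate(pocs):
        # Frames already decoded that have a higher POC must wait behind this one.
        displayed_later = sum(1 for earlier in pocs[:i] if earlier > poc)
        depth = max(depth, displayed_later)
    return depth


if __name__ == "__main__":
    print("presentation order:", sorted(POCS_DECODE_ORDER))
    print("required reorder depth:", required_reorder_depth(POCS_DECODE_ORDER))
```

For this extract the result is 2: the frames with POC 18, 20 and 22 (timecodes 0.500s, 0.520s and 0.540s) are each decoded after two frames that are displayed later.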

@Nevcairiel
Owner

This is a typical issue in broadcast streams, because the amount of B-frame reordering isn't known right from the start. To minimize decode latency (which is especially relevant for Live TV), LAV "guesses"; unfortunately, if the guess is too low, it has to drop a frame when it corrects itself.

The guessing goes especially wrong if different GOP structures are mixed, but the stream is otherwise perfectly valid.

I adjusted the guessing logic a bit to always consider at least two out-of-order frames, which seems most common in real-world broadcast streams, and this avoids the problem with this particular sample, but there are no guarantees that it won't come back later with others.
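The effect of a reorder guess that is too low can be illustrated with a simplified model. This is only a sketch, not LAV's actual code: it assumes a fixed-depth buffer that never adapts, the function name `reorder` is hypothetical, and the POC values are the ones from the h264_parse extract in the first post.

```python
import heapq

# pic_order_cnt_lsb values in decode order, taken from the extract in the first post.
POCS_DECODE_ORDER = [32, 24, 18, 20, 22, 26, 28, 30, 44, 34]


def reorder(pocs_in_decode_order, depth):
    """Simulate a reorder buffer that holds at most `depth` frames.

    Returns (output, dropped). This is a simplified model with a fixed depth;
    a real decoder would adjust its guess after noticing the problem.
    """
    heap, output, dropped = [], [], []
    last_out = None
    for poc in pocs_in_decode_order:
        if last_out is not None and poc < last_out:
            # A later frame was already output; emitting this one would
            # put frames out of presentation order, so it is dropped.
            dropped.append(poc)
            continue
        heapq.heappush(heap, poc)
        if len(heap) > depth:
            last_out = heapq.heappop(heap)  # release the earliest held frame
            output.append(last_out)
    while heap:  # end of input: flush whatever is still held
        output.append(heapq.heappop(heap))
    return output, dropped


if __name__ == "__main__":
    for depth in (1, 2):
        out, lost = reorder(POCS_DECODE_ORDER, depth)
        print(f"depth={depth} output={out} dropped={lost}")
```

In this model a depth of 1 drops the frames with POC 18, 20 and 22, which correspond to the timecodes 0.500s to 0.540s reported as missing, while a depth of 2 outputs every frame in order.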

@Nevcairiel Nevcairiel self-assigned this Dec 8, 2016
Nevcairiel added a commit that referenced this issue Dec 8, 2016
This avoids dropping frames in the most common broadcast streams while
still keeping decode latency low.

Fixes GitHub issue #113
@mkver
Author

mkver commented Dec 8, 2016

That was incredibly fast! Thanks.
But allow me to ask some questions:

  1. What is the reason that there was no problem with the longer clip? Does the decoder already increase this value on its own, and has this adaptation failed in the short clip because it didn't have enough time/samples to analyze?
  2. Does this fix the problem with DXVA2, too, or just the software decoder?
  3. Does this pose A/V sync issues because of the latency? Or is it just so that startup time will be a little bit longer because the decoder has to decode one more frame before it starts giving these frames to the renderer?
  4. The latency you are talking about is something different than the renderer-buffer, isn't it? (I always thought that the renderer buffer serves exactly this purpose: To give the decoder time to sort any issues out.)
  5. Modern computers (in particular modern hardware decoders) are powerful enough to decode such streams faster than realtime. So why do these frames need to be totally dropped? Wouldn't it be feasible to catch up by decoding faster than realtime?
  6. I always assumed that the number of reference frames is the number of frames that the decoder has to cache in its DPB in order to be able to play a stream, i.e. it is an upper bound for the number that you are trying to guess. Is that even true?

@Nevcairiel
Owner

  1. No clue why the longer clip is different. It's possible that more data causes the guessing logic to work differently.
  2. Both software and DXVA2.
  3. No, AV sync is unaffected. The only real difference is when decoding live sources, as you can't just read faster than realtime on those, so it takes 1 frame longer for output to appear. File decoding should not notice whatsoever.
  4. Yes, this is independent of the renderer. The decoder has to re-order frames because B-frames are encoded out of order to be able to use future frames as references (the future references have to be decoded first).
  5. This has nothing to do with speed. When re-ordering frames you have to know how long to hang on to a frame just in case another frame is coming that is supposed to go before this frame in presentation order. If this value is too low, you would already have output the frame, and not dropping the new frame would result in frames being out of order, which would be far worse than dropping them.
  6. This is the theoretical maximum, but as I tried to mention in the earlier post, LAV tries to optimize this to reduce the latency between input and output.

@mkver
Author

mkver commented Dec 8, 2016

  1. But doesn't the decoder know from pic_order_cnt_lsb that it shouldn't output the frame at 0.56 directly after the frame from 0.48? Or does it ignore it in order to improve the handling of defective streams?
  2. If latency is only an issue for live sources, can't the splitter tell the decoder whether this is live or a file? The optimization could then be restricted to live sources and the ref-value could be used for everything else. If I am not mistaken, this won't increase RAM usage, will it? After all, your guessing strategy does not allow you to store fewer frames in the DPB.
    And thanks for your valuable answer.

@Nevcairiel
Owner

The POC is not reliable for this. If there is a hole in the POCs, you don't know if any future frame is going to fill it, or if it's just going to remain open; there are no guarantees that every single "slot" is used.
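To make the point about holes concrete, here is a small sketch of a hypothetical "strict" strategy that only outputs a frame when it carries exactly the next expected POC. The function name, the POC step of 2 and both example streams are assumptions chosen for illustration; they are not taken from LAV or from the sample file.

```python
import heapq


def strict_poc_output(pocs_in_decode_order, first_poc=0, step=2):
    """Emit a frame only when it is exactly the next expected POC.

    Returns (output, stuck): frames emitted in order, and frames still held at
    the end of input because an expected POC never arrived.
    """
    heap, output = [], []
    expected = first_poc
    for poc in pocs_in_decode_order:
        heapq.heappush(heap, poc)
        # Release frames only while the sequence stays gap-free.
        while heap and heap[0] == expected:
            output.append(heapq.heappop(heap))
            expected += step
    return output, sorted(heap)


if __name__ == "__main__":
    # Hypothetical stream in which every POC slot is used.
    print(strict_poc_output([0, 8, 4, 2, 6]))
    # Hypothetical stream in which POC 2 is never used: everything after 0 stalls.
    print(strict_poc_output([0, 8, 4, 6]))
```

With the gap-free stream every frame is released in order, but as soon as one slot stays unused the strict strategy keeps waiting for it and holds all later frames, which is why some other bound on the reorder delay is needed.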

Memory use can be impacted, as these re-ordered frames are not necessarily reference frames, although the memory aspect is not something I'm very concerned with. I have been considering just sticking to standards-compliant behavior and buffering more frames, but in reality the "guessing" based on measuring the delay in the first GOP works fine in 99% of all cases.

@mkver
Author

mkver commented Dec 20, 2016

Thanks for everything again (including your patient explanations). I just checked the file with release 0.69 and it really works. Job well done!

dwbuiten referenced this issue in FFMS/ffms2 May 30, 2018
…ed as DISCARD

They do not count towards the codec delay, and are used to signal, for example,
edit list edits.

Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>