This repository has been archived by the owner. It is now read-only.

non-monotonous timestamps when muxing certain AVC/H.264 file #2028

Closed
sneaker2 opened this Issue Jul 1, 2017 · 19 comments

sneaker2 commented Jul 1, 2017

Hi,

I have a sample taken from a Blu-ray and remuxed to mkv using mkvmerge. When playing it back in MPC-HC/LAV + EVR or ffplay, there is a short pause about 2 to 3 seconds in. When remuxing with ffmpeg -i "output.mkv" -c copy "remux.mkv", ffmpeg throws a lot of errors:

 3086, current: 2461; changing to 3086. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2544; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2502; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2628; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2586; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2711; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2669; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2794; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2753; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2878; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2836; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 3003; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2961; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3212, current: 3128; changing to 3212. This may result in incorrect timestamps
in the output file.

File was created using mkvmerge 13.0 and a simple mkvmerge -o "output.mkv" "input.264".
Sample uploaded to ftp.
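
For anyone who wants to compare such logs, here is a minimal sketch that pulls the previous/current values out of the warnings and shows how far each DTS steps backwards. The function name parse_dts_warnings is made up for illustration; the only thing taken from ffmpeg is the message wording seen in the log above, and the sketch assumes the log has the same console line wrapping.

```python
import re

# Matches one ffmpeg warning once the wrapped console lines have been joined.
WARNING_RE = re.compile(
    r"Non-monotonous DTS in output stream (\d+):(\d+); "
    r"previous:\s*(\d+), current:\s*(\d+); changing to (\d+)"
)


def parse_dts_warnings(log_text):
    """Return (previous, current, backwards_step) for every warning found."""
    joined = " ".join(log_text.split())  # undo the console line wrapping
    results = []
    for match in WARNING_RE.finditer(joined):
        previous = int(match.group(3))
        current = int(match.group(4))
        results.append((previous, current, previous - current))
    return results


# Two warnings copied from the log above.
sample = """[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3170, current: 2544; changing to 3170. This may result in incorrect timestamps
in the output file.
[matroska @ 00000000005a28c0] Non-monotonous DTS in output stream 0:0; previous:
 3212, current: 3128; changing to 3212. This may result in incorrect timestamps
in the output file."""

for previous, current, step in parse_dts_warnings(sample):
    print(f"previous={previous} current={current} backwards by {step}")
```

A large backwards step that repeats against the same "previous" value, as with 3170 here, means that many consecutive packets were clamped to one timestamp.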

Owner

mbunkus commented Jul 1, 2017

Thanks. I'll look into it, but probably not soon.

mkver commented Jul 2, 2017

I don't have the sample, but my guess is that this is an issue in ffmpeg. Everything that follows is based on my assumption that you have the same issue that I sometimes have.

If the video stream does not signal the number of reorder frames, any player has to guess it. (It can be signalled in the bitstream via the VUI parameter max_num_reorder_frames, and x264 does so according to my checks.) If that guess is too low, some frames are skipped and you get a short hiccup (this might depend on the method your player uses for keeping audio and video in sync). I once asked Nevcairiel about this here and also uploaded a sample. My sample is a file that uses open GOPs in general but closed GOPs on scene changes (both with recovery point I frames, not IDR frames, but I don't think that this is important). It was produced by MKVToolNix, but I have also tested other samples that I remuxed from a transport stream with ffmpeg. When ffmpeg encounters the second GOP, it notices that its guess was too low and, when not using -c copy, emits an "Increasing reorder buffer to 2" warning (in my case). Some frames are skipped at this point.
The number of reorder frames is important for the difference between pts and dts. My sample looks like this (edited to remove unnecessary stuff):

I frame, track 1, timecode 0 (00:00:00.000), size 39825
...
P frame, track 1, timecode 480 (00:00:00.480), size 45102
B frame, track 1, timecode 380 (00:00:00.380), size 1602
P frame, track 1, timecode 400 (00:00:00.400), size 6081
P frame, track 1, timecode 420 (00:00:00.420), size 6079
P frame, track 1, timecode 440 (00:00:00.440), size 5935
P frame, track 1, timecode 460 (00:00:00.460), size 5923
Second cluster:
I frame, track 1, timecode 640 (00:00:00.640), size 48044
B frame, track 1, timecode 560 (00:00:00.560), size 10815
B frame, track 1, timecode 500 (00:00:00.500), size 3662
P frame, track 1, timecode 520 (00:00:00.520), size 5051
P frame, track 1, timecode 540 (00:00:00.540), size 5048
P frame, track 1, timecode 580 (00:00:00.580), size 1981

The first cluster has 25 frames, and the highest timecode among them is 480ms. So the frames with timecodes 500, 520 and 540 are 27-29 in decoding order but 25-27 in presentation order (both orders zero-based). This is incompatible with using one reorder frame. These are the frames that were skipped, the ones that I told Nevcairiel about. If one uses ffmpeg's copy mode, one gets a different kind of error: the very same frames trigger the "Non-monotonous DTS" error. But as Nevcairiel explained, the error actually happens before ffmpeg reaches these frames, because its guess implies that there is a hole in the stream (no frames with timecodes >480ms and <560ms). If one makes a stream copy of Skipped.frames.mkv and uses -debug_ts, one can see it in the output:

demuxer -> ist_index:0 type:video next_dts:480000 next_dts_time:0.48 next_pts:480000 next_pts_time:0.48 pkt_pts:640 pkt_pts_time:0.64 pkt_dts:480 pkt_dts_time:0.48 off:0 off_time:0
demuxer+ffmpeg -> ist_index:0 type:video pkt_pts:640 pkt_pts_time:0.64 pkt_dts:480 pkt_dts_time:0.48 off:0 off_time:0
muxer <- type:video pkt_pts:640 pkt_pts_time:0.64 pkt_dts:480 pkt_dts_time:0.48 size:48044
[matroska @ 00000000030d9960] Starting new cluster at offset 266929 bytes, pts 640dts 480
[matroska @ 00000000030d9960] Writing block at offset 10, size 48044, pts 640, dts 480, duration 20, keyframe 1
[NULL @ 000000000039a180] ct_type:0 pic_struct:0
demuxer -> ist_index:0 type:video next_dts:500000 next_dts_time:0.5 next_pts:500000 next_pts_time:0.5 pkt_pts:560 pkt_pts_time:0.56 pkt_dts:560 pkt_dts_time:0.56 off:0 off_time:0
demuxer+ffmpeg -> ist_index:0 type:video pkt_pts:560 pkt_pts_time:0.56 pkt_dts:560 pkt_dts_time:0.56 off:0 off_time:0
muxer <- type:video pkt_pts:560 pkt_pts_time:0.56 pkt_dts:560 pkt_dts_time:0.56 size:10815
[matroska @ 00000000030d9960] Writing block at offset 48062, size 10815, pts 560, dts 560, duration 20, keyframe 0
[NULL @ 000000000039a180] ct_type:0 pic_struct:0
demuxer -> ist_index:0 type:video next_dts:580000 next_dts_time:0.58 next_pts:580000 next_pts_time:0.58 pkt_pts:500 pkt_pts_time:0.5 pkt_dts:500 pkt_dts_time:0.5 off:0 off_time:0
demuxer+ffmpeg -> ist_index:0 type:video pkt_pts:500 pkt_pts_time:0.5 pkt_dts:500 pkt_dts_time:0.5 off:0 off_time:0
[matroska @ 00000000030d9960] Non-monotonous DTS in output stream 0:0; previous: 560, current: 500; changing to 560. This may result in incorrect timestamps in the output file.
muxer <- type:video pkt_pts:560 pkt_pts_time:0.56 pkt_dts:560 pkt_dts_time:0.56 size:3662

The error happens while demuxing the frame with pts 560ms, where pkt_dts jumps by 80ms. In the resulting file, the aforementioned frames 25-27 have pts 560ms, so it is not a real copy. This also happens if the original file has been remuxed from a transport stream to mkv with ffmpeg (this shows that this is not a bug in MKVToolNix).
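
To make the pts/dts relationship concrete, here is a rough sketch of a simplified model. It is not ffmpeg's actual code: it assumes that a container without stored DTS gets them by handing the i-th packet the (i - delay)-th smallest PTS. The function names, the delay parameter and the backwards extrapolation for the first packets are all assumptions made for illustration. The input is the part of the frame listing quoted earlier in this comment.

```python
def derive_dts(pts_in_decode_order, delay, frame_duration):
    """Give the i-th packet the (i - delay)-th smallest PTS as its DTS."""
    ordered = sorted(pts_in_decode_order)
    dts = []
    for i in range(len(pts_in_decode_order)):
        if i >= delay:
            dts.append(ordered[i - delay])
        else:
            # Extrapolate backwards for the first `delay` packets.
            dts.append(ordered[0] - (delay - i) * frame_duration)
    return dts


def late_frames(pts_in_decode_order, delay, frame_duration):
    """Return the PTS of packets that would be presented before being decoded."""
    dts = derive_dts(pts_in_decode_order, delay, frame_duration)
    return [p for p, d in zip(pts_in_decode_order, dts) if p < d]


# Timecodes (ms) in decoding order: the last six frames of the first
# cluster followed by the six frames listed for the second cluster.
pts = [480, 380, 400, 420, 440, 460, 640, 560, 500, 520, 540, 580]

print(late_frames(pts, delay=1, frame_duration=20))  # [500, 520, 540]
print(late_frames(pts, delay=2, frame_duration=20))  # []
```

In this model, a delay of one frame leaves exactly the frames at 500, 520 and 540 ms with pts < dts, while a delay of two frames leaves none.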

But enough about ffmpeg. If a stream doesn't contain the reorder parameter, couldn't a muxer like MKVToolNix find it out during muxing, determine the right value and write it to the appropriate field in the header? (After all, that's what the MinCache and MaxCache values are there for, aren't they?) That's the only thing a muxer can do, if I am not mistaken. (A muxer can't just add the max_num_reorder_frames information to the CodecPrivate. Besides the problem that TV recordings and Blu-ray remuxes repeat the SPS, so that one can't update them all in one pass because the right value isn't known at the beginning, there is a second problem: if one includes this value, one also needs to include some other values like log2_max_mv_length_horizontal (concerning the horizontal length of motion vectors), information that a muxer simply doesn't know.) The demuxer/decoder would need to be updated to use this information, too. But then this would solve the problem.
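
What "finding it out during muxing" could look like, as a minimal sketch: count, for every frame, how many frames precede it in decoding order but follow it in presentation order, and keep the maximum. The function name required_reorder_frames is hypothetical and this is not MKVToolNix code; the timecodes are the ones from the listing above.

```python
def required_reorder_frames(pts_in_decode_order):
    """Largest number of frames that precede some frame in decoding order
    but follow it in presentation order."""
    worst = 0
    for i, pts in enumerate(pts_in_decode_order):
        shown_later = sum(1 for earlier in pts_in_decode_order[:i] if earlier > pts)
        worst = max(worst, shown_later)
    return worst


# Timecodes (ms) in decoding order, taken from the listing above.
first_cluster_tail = [480, 380, 400, 420, 440, 460]
second_cluster = [640, 560, 500, 520, 540, 580]

print(required_reorder_frames(first_cluster_tail))                   # 1
print(required_reorder_frames(first_cluster_tail + second_cluster))  # 2
```

The quadratic scan is only for clarity. A real muxer would presumably keep a small window of pending timestamps instead, and it would only know the final value once the whole stream has been read.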

Regards
Andi

PS: By the way: Where is the reference pseudo-cache system the specs mention explained?
PPS: Feel free to file a bug report for ffmpeg based upon the above.

Owner

mbunkus commented Jul 2, 2017

This is effectively a duplicate of #1671. mkvmerge doesn't detect certain frames as I frames properly and instead treats them as B frames. Therefore the frame timing is not 100% correct.

@mbunkus mbunkus closed this Jul 2, 2017

sneaker2 commented Jul 2, 2017

Thanks to you both for the answers.

In case anyone else is interested in the sample:
https://mega.nz/#F!R0cyTJoR!cs_uzSRc_1PXzHxg34GXJg

sneaker2 commented Jul 2, 2017

If one were to remux a properly muxed mkv, mp4 or (m2)ts file, would one be safe from such a problem?

Owner

mbunkus commented Jul 2, 2017

After looking into it some more, I'm no longer sure that this is actually a duplicate. I'll re-open it for the time being.

mkver commented Jul 3, 2017

My observations (I am concentrating on output.mkv; Juddery.mkv contains another such scene at the beginning that is somehow not in input.264; has it been deleted manually?):

  1. According to h264_parse, this stream contains 5 I frames, two of them IDR frames. The non-IDR I frames are flagged as recovery points with exact_match_flag and are properly detected as keyframes by MKVToolNix. Using --engage all_i_slices_are_key_frames doesn't change anything.
  2. If I remember correctly, Blu-rays permit two types of H.264: level 4.1 H.264 with a GOP length of about 24 frames (don't hold me to the number), where the use of four slices is mandatory, and level 4.0 H.264 with a GOP length double that of level 4.1, where slices needn't be used. This example is level 4.0, which means that the long gap between the third and the fourth keyframe is not out of spec. This might be what led Moritz to believe that there is an undetected keyframe somewhere.
  3. MPC-HC with the (Intel) QuickSync decoder plays this file without problems. QuickSync seems to use a very high number of reorder frames, so this is evidence that they are indeed the culprit.
  4. Remuxing output.mkv with ffmpeg shows what I have described in my earlier post:
I frame, track 1, timecode 3045 (00:00:03.045), size 284362, adler 0xc17340e9
B frame, track 1, timecode 2920 (00:00:02.920), size 59946, adler 0xb61693fe
P frame, track 1, timecode 3170 (00:00:03.170), size 92650, adler 0xbd86fc84
B frame, track 1, timecode 3086 (00:00:03.086), size 59565, adler 0x56bfbf64
B frame, track 1, timecode 2461 (00:00:02.461), size 136771, adler 0x6192aa90
P frame, track 1, timecode 3253 (00:00:03.253), size 58796, adler 0xc0adc069
B frame, track 1, timecode 2544 (00:00:02.544), size 76512, adler 0xae6fefa8
B frame, track 1, timecode 2502 (00:00:02.502), size 58955, adler 0x18b7eae2
P frame, track 1, timecode 2628 (00:00:02.628), size 131606, adler 0x962235b5
B frame, track 1, timecode 2586 (00:00:02.586), size 50738, adler 0x16dbd92c
P frame, track 1, timecode 2711 (00:00:02.711), size 87557, adler 0x74d2b9b7
B frame, track 1, timecode 2669 (00:00:02.669), size 49392, adler 0x13e053f6
P frame, track 1, timecode 2794 (00:00:02.794), size 104068, adler 0xade4e73b
B frame, track 1, timecode 2753 (00:00:02.753), size 60364, adler 0xdbba52fb
P frame, track 1, timecode 2878 (00:00:02.878), size 91306, adler 0x0075f38f
B frame, track 1, timecode 2836 (00:00:02.836), size 53818, adler 0x65c3a461
P frame, track 1, timecode 3003 (00:00:03.003), size 123739, adler 0x0214c68b
B frame, track 1, timecode 2961 (00:00:02.961), size 56165, adler 0xc2e3f060
P frame, track 1, timecode 3212 (00:00:03.212), size 94165, adler 0x35c6df8f
B frame, track 1, timecode 3128 (00:00:03.128), size 57371, adler 0x978ae777
P frame, track 1, timecode 3295 (00:00:03.295), size 117356, adler 0x25c46865

becomes

I frame, track 1, timecode 3045 (00:00:03.045), size 284362, adler 0xc17340e9
P frame, track 1, timecode 2920 (00:00:02.920), size 59946, adler 0xb61693fe
P frame, track 1, timecode 3170 (00:00:03.170), size 92650, adler 0xbd86fc84
P frame, track 1, timecode 3086 (00:00:03.086), size 59565, adler 0x56bfbf64
P frame, track 1, timecode 3086 (00:00:03.086), size 136771, adler 0x6192aa90
P frame, track 1, timecode 3253 (00:00:03.253), size 58796, adler 0xc0adc069
P frame, track 1, timecode 3170 (00:00:03.170), size 76512, adler 0xae6fefa8
P frame, track 1, timecode 3170 (00:00:03.170), size 58955, adler 0x18b7eae2
P frame, track 1, timecode 3170 (00:00:03.170), size 131606, adler 0x962235b5
P frame, track 1, timecode 3170 (00:00:03.170), size 50738, adler 0x16dbd92c
P frame, track 1, timecode 3170 (00:00:03.170), size 87557, adler 0x74d2b9b7
P frame, track 1, timecode 3170 (00:00:03.170), size 49392, adler 0x13e053f6
P frame, track 1, timecode 3170 (00:00:03.170), size 104068, adler 0xade4e73b
P frame, track 1, timecode 3170 (00:00:03.170), size 60364, adler 0xdbba52fb
P frame, track 1, timecode 3170 (00:00:03.170), size 91306, adler 0x0075f38f
P frame, track 1, timecode 3170 (00:00:03.170), size 53818, adler 0x65c3a461
P frame, track 1, timecode 3170 (00:00:03.170), size 123739, adler 0x0214c68b
P frame, track 1, timecode 3170 (00:00:03.170), size 56165, adler 0xc2e3f060
P frame, track 1, timecode 3212 (00:00:03.212), size 94165, adler 0x35c6df8f
P frame, track 1, timecode 3212 (00:00:03.212), size 57371, adler 0x978ae777
P frame, track 1, timecode 3295 (00:00:03.295), size 117356, adler 0x25c46865
  5. But it's nevertheless not a case of the number of reorder frames being unknown: the stream contains the num_reorder_frames element. At first I wanted to write that it is too low (namely 1) and that the file is defective. But then I saw something else: although this stream is progressive, pic_order_cnt_lsb isn't always even. It actually seems as if this stream was produced by an encoder whose programmers didn't read the standard (or am I missing something here?). Furthermore, if the above timestamps were right, one would need at least 6 reorder frames (there are six frames that precede the frame with adler 0x18b7eae2 in decoding order but follow it in display order). This would mean that the decoded picture buffer needed to be at least six frames big; this is outside of levels 4.0 and 4.1. In other words, it wouldn't play on an ordinary Blu-ray player. I don't know what movie this is, but I think that sneaker has checked whether Amazon is full of customers downvoting this Blu-ray because it doesn't work.
  6. So something else must be happening: from the 4th keyframe on, the POCs according to h264_parse are 12, 11, 14, 13, 0, 15, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 15. This is a different order than the one MKVToolNix produces. This strange behaviour led me to take another look at the SPS, and I noticed that they change: the number of bits used to describe frame_num and the POC changes. The SPS actually in use (at the time of the issue) contains
   log2_max_frame_num_minus4: 3
   pic_order_cnt_type: 0
    log2_max_pic_order_cnt_lsb_minus4: 0

The next SPS contains:

   log2_max_frame_num_minus4: 0
   pic_order_cnt_type: 0
    log2_max_pic_order_cnt_lsb_minus4: 1

Remembering that MKVToolNix reads and caches a whole GOP at once (hence the memory issues if it doesn't find a keyframe) it seemed to me that MKVToolNix might use the next SPS too early. Therefore I cut the last GOP away, extracted the 264 stream (with mkvextract) and remuxed and voila: The file plays fine. Here are the new timestamps:

I frame, track 1, timecode 2502 (00:00:02.502), size 284362, adler 0xc17340e9
B frame, track 1, timecode 2461 (00:00:02.461), size 59946, adler 0xb61693fe
P frame, track 1, timecode 2586 (00:00:02.586), size 92650, adler 0xbd86fc84
B frame, track 1, timecode 2544 (00:00:02.544), size 59565, adler 0x56bfbf64
P frame, track 1, timecode 2669 (00:00:02.669), size 136771, adler 0x6192aa90
B frame, track 1, timecode 2628 (00:00:02.628), size 58796, adler 0xc0adc069
P frame, track 1, timecode 2753 (00:00:02.753), size 76512, adler 0xae6fefa8
B frame, track 1, timecode 2711 (00:00:02.711), size 58955, adler 0x18b7eae2
P frame, track 1, timecode 2836 (00:00:02.836), size 131606, adler 0x962235b5
B frame, track 1, timecode 2794 (00:00:02.794), size 50738, adler 0x16dbd92c
P frame, track 1, timecode 2920 (00:00:02.920), size 87557, adler 0x74d2b9b7
B frame, track 1, timecode 2878 (00:00:02.878), size 49392, adler 0x13e053f6
P frame, track 1, timecode 3003 (00:00:03.003), size 104068, adler 0xade4e73b
B frame, track 1, timecode 2961 (00:00:02.961), size 60364, adler 0xdbba52fb
P frame, track 1, timecode 3086 (00:00:03.086), size 91306, adler 0x0075f38f
B frame, track 1, timecode 3045 (00:00:03.045), size 53818, adler 0x65c3a461
P frame, track 1, timecode 3170 (00:00:03.170), size 123739, adler 0x0214c68b
B frame, track 1, timecode 3128 (00:00:03.128), size 56165, adler 0xc2e3f060
P frame, track 1, timecode 3253 (00:00:03.253), size 94165, adler 0x35c6df8f
B frame, track 1, timecode 3212 (00:00:03.212), size 57371, adler 0x978ae777
P frame, track 1, timecode 3295 (00:00:03.295), size 117356, adler 0x25c46865

The presentation order is now the same as the one given by the poc. Here is the resulting file. By the way: the first two GOPs in "Juddery BluRay.REMUX.AVC.DTS-HD.MA.5.1.mkv" also use different SPS, so this is probably the same issue.
7. Just to be sure, I checked how the poc of the problematic GOP would look if it were parsed according to the SPS of the next GOP. To do so, I replaced the SPS/PPS of that GOP with those of the next one. The result did not agree with my hypothesis that the next GOP's SPS is being used. And if I mux this into Matroska, the resulting timestamps are weird, too. Here is the Matroska file, here its source.
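
To illustrate why a premature SPS switch scrambles the order, here is a minimal Python sketch. It decodes the same slice header bits once with the field widths of the SPS in use and once with those of the next SPS. In H.264, frame_num takes log2_max_frame_num_minus4 + 4 bits and pic_order_cnt_lsb takes log2_max_pic_order_cnt_lsb_minus4 + 4 bits. The helper names and the bit pattern are made up for illustration, and the sketch assumes the two fields sit next to each other (a non-IDR frame slice with frame_mbs_only_flag equal to 1 and pic_order_cnt_type equal to 0); it is not how mkvmerge or h264_parse are implemented.

```python
def read_bits(bits, pos, width):
    """Read `width` bits from the bit string `bits`, starting at `pos`."""
    return int(bits[pos:pos + width], 2), pos + width


def parse_fields(bits, log2_max_frame_num_minus4, log2_max_pic_order_cnt_lsb_minus4):
    """Decode (frame_num, pic_order_cnt_lsb) with the widths given by an SPS.

    Simplification (assumption): everything before frame_num is omitted and
    the two fields are treated as adjacent.
    """
    frame_num, pos = read_bits(bits, 0, log2_max_frame_num_minus4 + 4)
    poc_lsb, _ = read_bits(bits, pos, log2_max_pic_order_cnt_lsb_minus4 + 4)
    return frame_num, poc_lsb


# Hypothetical bits: frame_num = 5 in 7 bits, pic_order_cnt_lsb = 12 in 4 bits.
bits = "0000101" + "1100"

with_active_sps = parse_fields(bits, 3, 0)  # widths 7 and 4 (SPS in use)
with_next_sps = parse_fields(bits, 0, 1)    # widths 4 and 5 (next SPS)

print(with_active_sps)  # (5, 12)
print(with_next_sps)    # (0, 23)
```

The same bits give a completely different poc once the wrong widths are applied, which is the kind of mix-up that would lead to a wrong presentation order.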

Regards
Andi

sneaker2 Jul 3, 2017

> the Juddery.mkv contains another such scene at the beginning that is somehow not in input.264; has it been deleted manually?):

Yes, I deleted a bit from start and end to make sure there are no stray slices because of the OpenGOP thing.

> Also, slices needn't be used here. This example is level 4.0 which means that the long gap between the third and the fourth keyframe is not out of specs.

I don't see any long (>1 sec) GOP. Third and fourth I frame are only 20 frames apart. (not that it matters - mkv isn't Blu-ray)

mkver Jul 3, 2017

  1. My previous post went through many revisions before I posted it. At the time I wrote about the long GOP, I had just looked at output.mkv and saw that between 1668ms and 3045ms there were no further keyframes, and I explained this with the level 4.0 thing. This comment was directed at Moritz, because I thought that he mistakenly believed that the GOP was too long for a Blu-ray and that there must therefore be a keyframe in there that hasn't been properly detected. It has now turned out that the GOP in reality wasn't that long (the keyframe at 3045ms is in reality at 2502ms). Had I measured GOP length in frames (as one should), I would never have claimed that there is a long GOP.
  2. There is no open GOP at the beginning. If you cut everything away except the first GOP, extract the elementary stream and remux it, you will get the following timestamps:
I frame, track 1, timecode 0 (00:00:00.000), size 130294, adler 0x4e017348
P frame, track 1, timecode 125 (00:00:00.125), size 69621, adler 0xf7e17aab
B frame, track 1, timecode 42 (00:00:00.042), size 91609, adler 0xc9e5244e
P frame, track 1, timecode 83 (00:00:00.083), size 32057, adler 0x6d2a5c52
P frame, track 1, timecode 209 (00:00:00.209), size 36095, adler 0x263d8f75
B frame, track 1, timecode 167 (00:00:00.167), size 27890, adler 0x3a582413
P frame, track 1, timecode 292 (00:00:00.292), size 90550, adler 0x22c18fac
B frame, track 1, timecode 250 (00:00:00.250), size 28930, adler 0x1c798252
P frame, track 1, timecode 375 (00:00:00.375), size 39904, adler 0xbfd6ebaa
B frame, track 1, timecode 334 (00:00:00.334), size 20228, adler 0x3f2b55ba
P frame, track 1, timecode 459 (00:00:00.459), size 48255, adler 0x431d0326
B frame, track 1, timecode 417 (00:00:00.417), size 19885, adler 0x0abcccb9
P frame, track 1, timecode 542 (00:00:00.542), size 37427, adler 0xf797f939
B frame, track 1, timecode 500 (00:00:00.500), size 34183, adler 0x3b6ac5e2
P frame, track 1, timecode 667 (00:00:00.667), size 66748, adler 0x5d7dca31
B frame, track 1, timecode 584 (00:00:00.584), size 36949, adler 0x839d3381
P frame, track 1, timecode 626 (00:00:00.626), size 35223, adler 0x60775a46

Then it plays without problems.
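
For anyone who wants to check such dumps automatically, here is a rough Python sketch. It assumes the frame summary lines have exactly the format shown in this thread and that they are listed in decoding order; the function names are made up. It reports timecodes that occur more than once (as in the broken mux, where several frames share 3170) and the largest number of frames that precede a frame in decoding order but follow it in display order, which can be compared against num_reorder_frames.

```python
import re
from collections import Counter

# Matches summary lines such as
# "P frame, track 1, timecode 3170 (00:00:03.170), size 91306, adler 0x0075f38f"
LINE_RE = re.compile(r"^[IPB] frame, track (\d+), timecode (\d+) ")


def timecodes(lines, track=1):
    """Return the timecodes (in ms) of one track, in the order listed."""
    result = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match.group(1)) == track:
            result.append(int(match.group(2)))
    return result


def duplicated(values):
    """Return the timecodes that are used by more than one frame."""
    counts = Counter(values)
    return sorted(value for value, count in counts.items() if count > 1)


def reorder_depth(values):
    """Largest number of frames listed before a frame but displayed after it."""
    return max(
        (sum(1 for earlier in values[:i] if earlier > value)
         for i, value in enumerate(values)),
        default=0,
    )


fixed = [
    "I frame, track 1, timecode 0 (00:00:00.000), size 130294, adler 0x4e017348",
    "P frame, track 1, timecode 125 (00:00:00.125), size 69621, adler 0xf7e17aab",
    "B frame, track 1, timecode 42 (00:00:00.042), size 91609, adler 0xc9e5244e",
    "P frame, track 1, timecode 83 (00:00:00.083), size 32057, adler 0x6d2a5c52",
]
broken = [
    "P frame, track 1, timecode 3170 (00:00:03.170), size 123739, adler 0x0214c68b",
    "P frame, track 1, timecode 3170 (00:00:03.170), size 56165, adler 0xc2e3f060",
    "P frame, track 1, timecode 3212 (00:00:03.212), size 94165, adler 0x35c6df8f",
]

print(duplicated(timecodes(fixed)), reorder_depth(timecodes(fixed)))  # [] 1
print(duplicated(timecodes(broken)))                                  # [3170]
```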

sneaker2 Jul 3, 2017

  1. My bad. You are correct. Zond marked it with the color for non-IDR so that got me confused. I don't know why it does that, though. Maybe because of frame_num/pic_order_cnt_lsb >1 or (missing) idr_pic_id?

mbunkus Jul 3, 2017

Owner

Thanks for the detailed analysis.

> it seemed to me that MKVToolNix might use the next SPS too early

That's a very good conjecture and quite possible! I'll verify that.

mkver Jul 3, 2017

  1. It does that because it isn't an IDR frame; it's just an I frame with a recovery point and no orphaned B-frames that are shared between GOPs. Yes, not using an IDR frame at this point is suboptimal, but even x264 does it that way when one uses --open-gop and the frame is detected as a scene change.
  2. h264_parse shows only even poc values for the frames in the first and the last GOP of "Juddery BluRay.REMUX.AVC.DTS-HD.MA.5.1.mkv" (the ones using the SPS with log2_max_frame_num_minus4 equal to 0 and log2_max_pic_order_cnt_lsb_minus4 equal to 1) and both odd and even values for the ones from the GOPs with the other SPS. What does Zond do in this matter? Maybe it's in reality a bug in h264_parse.
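
As a side note on the IDR question: the difference is visible in the first byte of each NAL unit. An IDR picture is carried in slices with nal_unit_type 5, whereas an I frame that merely comes with a recovery point uses ordinary slices with nal_unit_type 1. A minimal Python sketch of reading that byte follows; the function name and the sample bytes are only illustrative, and the table lists just the types relevant to this thread.

```python
# Subset of H.264 nal_unit_type values relevant to this discussion.
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
}


def describe_nal_header(first_byte):
    """Split the one-byte NAL unit header into its three fields."""
    nal_unit_type = first_byte & 0x1F
    return {
        "forbidden_zero_bit": first_byte >> 7,
        "nal_ref_idc": (first_byte >> 5) & 0x3,
        "nal_unit_type": nal_unit_type,
        "name": NAL_TYPE_NAMES.get(nal_unit_type, "other"),
    }


# Typical header bytes as they appear right after a start code.
for byte in (0x67, 0x68, 0x06, 0x65, 0x41):
    info = describe_nal_header(byte)
    print(f"0x{byte:02x}: type {info['nal_unit_type']} ({info['name']})")
```

A frame whose slices all report type 1 is not an IDR frame, regardless of whether every slice in it is an I slice.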

sneaker2 Jul 3, 2017

  1. Looks the same in Zond.

P.S.: Get free demo here (on the left).

@mbunkus mbunkus closed this in def48b2 Jul 4, 2017

mbunkus added a commit that referenced this issue Jul 4, 2017

HEVC/h.265 parser: flush queued frames on SPS/PPS changes
Whenever a sequence parameter set or picture parameter set
changes (meaning an SPS with the same ID as an earlier SPS but with
different content is found), all frames queued for order & timestamp
calculation must be flushed. Otherwise frame order calculation will be
based on wrong values for some frames and on correct values for other
frames.

This is the HEVC/h.265 equivalent of #2028.
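
The idea in the commit message can be sketched roughly as follows. This is a simplified Python illustration, not mkvmerge's actual code: the class, its method names and the fixed frame duration of 40 ms are all assumptions. Frames are queued in decoding order and get display slots by sorting on poc; when an SPS arrives with a known ID but different content, the queue is flushed first so that poc values parsed under different parameter sets are never sorted together.

```python
class ReorderQueue:
    """Toy model of a frame queue that assigns timestamps from poc order."""

    def __init__(self, frame_duration_ms=40):
        self.frame_duration_ms = frame_duration_ms  # hypothetical constant
        self.sps_by_id = {}   # last seen payload per SPS ID
        self.queued = []      # poc values in decoding order
        self.next_slot = 0    # next free display slot
        self.output = []      # (poc, timestamp_ms) in decoding order

    def on_sps(self, sps_id, payload):
        previous = self.sps_by_id.get(sps_id)
        if previous is not None and previous != payload:
            # Same ID, different content: finish the frames parsed so far.
            self.flush()
        self.sps_by_id[sps_id] = payload

    def on_frame(self, poc):
        self.queued.append(poc)

    def flush(self):
        # Rank the queued frames by poc to obtain their display order.
        by_poc = sorted(range(len(self.queued)), key=lambda i: self.queued[i])
        slot_of = {index: self.next_slot + rank for rank, index in enumerate(by_poc)}
        for index, poc in enumerate(self.queued):
            self.output.append((poc, slot_of[index] * self.frame_duration_ms))
        self.next_slot += len(self.queued)
        self.queued.clear()


queue = ReorderQueue()
queue.on_sps(0, b"sps-a")
for poc in (0, 4, 2):
    queue.on_frame(poc)
queue.on_sps(0, b"sps-b")   # changed content triggers a flush
for poc in (0, 4, 2):
    queue.on_frame(poc)
queue.flush()

print([timestamp for _, timestamp in queue.output])
# [0, 80, 40, 120, 200, 160]
```

Without the flush in `on_sps`, all six frames would be sorted as one group and the two frames with poc 0 (and likewise 2 and 4) would compete for neighbouring slots, mixing frames from both groups.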

mbunkus Jul 4, 2017

Owner

Should be fixed now. Playback is smooth, and ffmpeg doesn't emit any warning anymore.

New pre-builds are compiling and will be available shortly (build numbers 01520 and higher).

sneaker2 Jul 4, 2017

With the pre-build it seems the SPS/PPS of the second GOP end up in CodecPrivate. Is this intended? I used "full_sample.264" to test. (on ftp and Mega)

mbunkus Jul 4, 2017

Owner

Not really intended, but it shouldn't matter as mkvmerge has to prefix all I frames with the currently active SPS/PPS anyway.

sneaker2 Jul 4, 2017

I see. Thx (both of you). I can confirm playback is fine and ffmpeg errors gone when muxed from elementary stream using latest pre-build.

mkver Jul 5, 2017

I can confirm that, for some reason, it is not the first SPS/PPS that lands in CodecPrivate for "full_sample.264". But this happens with both 13.0 and the pre-build; I don't see a change.
PS: The "res: fixed/implemented" label has not been set.

sneaker2 Jul 5, 2017

Yes, it seems that, contrary to what I said, 13.0.0 (and 12.0.0) behave the same way. It's not a change in the pre-build. I must have mixed up the files (yet again).
