-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Coded frame processing is not clear around what to do for SAP Type 2 overlaps #187
Comments
@acolwell I did some spelunking through the repo history, and the text around supporting SAP Type 1 and 2 existed in the first commit: e657ebf. Needless to say, it appears SAP Type 2 support (where PTS of later frame in decode order can precede PTS of random access point) has been allowed since MSE was proposed. Am I missing something? WDYT about adding |
@plehegar @paulbrucecotton, would adding something like |
For H264, Firefox ignores the container's sample table flags, and instead uniquely rely on the frame containing a NAL of type IDR_SLICE When we did rely on the container's sample table flags, we use the same logic as FFmpeg to determine if the frame depended on any others: For all other codec (except audio, where we marked all frames as keyframe): An old comment I wrote on the matter: with an extra note "The flag sample_is_non_sync_sample provides the same information as the sync sample table [8.6.2]. When this value is set 0 for a sample, it is the same as if the sample were not in a movie fragment and marked with an entry in the sync sample table (or, if all samples are sync samples, the sync sample table were absent).”. Here we have no sync table (no stss box). So we use the mask 0x01010000; if the operation sampleFlag & 0x1010000 gives us 0, we assume it’s a keyframe ." I have never seen a case where the frames marked as random access point in the container weren't the first frame of the GOP (in both pts and dts order). So in short, does it matter ? :) |
It would be excellent if this were truly the case (as it had been my assumption until early last week, when reparsing the ISOBMFF spec text around SAP Type 2 indicated that we must support this case you haven't ever seen). However, we've seen folks at least having problems with similar streams in non-MSE case, and we have seen some cameras that produce SAP Type 2 streams. I'll add some logging to Chrome to detect this case. Do you have any Firefox telemetry confirming your statement that no-one ever uses H264 with SAP Type 2? |
@wolenetz I believe I only included SAP Type 2 because I was just quickly looking to see which types mapped to "closed GOPs". I likely didn't pay attention to the fact that it allowed the presentation time of a later access unit to come before the first access unit. I suspect SAP Type 2 is pretty rare since it could cause all sorts of complications. My initial instinct is to just drop frames that have presentation times before the PTS of the first access unit, but I suspect if the encoder is sufficiently crazy this could break decode dependencies. I suppose you would also wiggle out of this by dropping the SAP Type 2 requirement since you likely didn't prove interop on it. :) I suspect trying to support later access units with earlier presentation times will open a can of worms that just isn't worth solving. I can't really see a way to do it without having full dependency knowledge of the full GOP because the earliest frame could be the last one in decode order if you happen to have a particularly cruel encoder (say one that encodes frames in reverse order!). |
The other difficulty added here is that without deep analysis of the h264 stream, there's no real way to know. It would be really unfortunate to change this... |
OK. I'll add some telemetry logging to Chrome and proceed under the assumption that SAP Type 2 occurrences, if they occur, are rare. Based on the results of such telemetry and perhaps further discussion, we might want to change this issue to "Deprecate ISOBMFF SAP Type 2 support". |
The W3C Process document clearly defines classes of changes. It would appear that this may be a Type 3 change ie one that might From the current HME WG charter:
HME WG is currently chartered to do only Type 1, Type 2 and Type 3 changes. If I am correct that your proposed change is a Type 3 change then it is scope of the HME WG. /paulc |
FYI, in GPAC, we have never seen SAP Type 2 used. |
@cconcolato (#187 (comment)) that is more excellent news. Thank you for providing that data point. |
With https://chromium-review.googlesource.com/c/563017/2 applied to Chrome to detect SAP Type 2, spot-checking several sites showed that at least nest.com's desktop site uses SAP Type 2 + MSE; youtube and netflix desktop site spot-checks showed only SAP Type 1. So it appears SAP Type 2 is used at least in some cases. I'm not certain this warrants adding |
If a random access point doesn't have the earliest presentation time of other frames that depend on it (eg, other frames in later decode time up until the next random access point), the MSE spec was not designed to support processing and buffering it well. With the change to managing and reporting buffered ranges by PTS intervals instead of DTS intervals, this could impact interop. This change detects this general case and logs once per track to chrome://media-internals. Later changes might include telemetry collection to assist removing or fixing support for at least SAP Type 2 in the MSE ISOBMFF bytestream spec. To verify the new log is emitted by the new test, this change also upgrades FrameProcessorTest's |media_log_| to a StrictMock<MockMediaLog> and includes new strict verification of logs emitted during FrameProcessorTests. See also related spec issue w3c/media-source#187 BUG=739931,718641 Change-Id: I361177dee6a5c70edf17bdbde2f3ea643977e6ec Reviewed-on: https://chromium-review.googlesource.com/563017 Commit-Queue: Chrome Cunningham <chcunningham@chromium.org> Reviewed-by: Chrome Cunningham <chcunningham@chromium.org> Cr-Commit-Position: refs/heads/master@{#485125}
Upon further thought, the use of SAP type 2 will have no negative impact with firefox MSE stack. There's no need for the I-Frame to be the earliest presented frame of the GOP. |
So far, preliminary (Chrome M61 - dev channel) shows a small % of occurrences of SAP-type-2 usage in MSE. However, this data is very preliminary, I'll update with public numbers once the related telemetry logging is in a more significant portion of Chrome usage. |
If a random access point doesn't have the earliest presentation time of other frames that depend on it (eg, other frames in later decode time up until the next random access point), the MSE spec was not designed to support processing and buffering it well. With the change to managing and reporting buffered ranges by PTS intervals instead of DTS intervals, this could impact interop. This change detects this general case and logs once per track to chrome://media-internals. Later changes might include telemetry collection to assist removing or fixing support for at least SAP Type 2 in the MSE ISOBMFF bytestream spec. To verify the new log is emitted by the new test, this change also upgrades FrameProcessorTest's |media_log_| to a StrictMock<MockMediaLog> and includes new strict verification of logs emitted during FrameProcessorTests. See also related spec issue w3c/media-source#187 BUG=739931,718641 Change-Id: I361177dee6a5c70edf17bdbde2f3ea643977e6ec Reviewed-on: https://chromium-review.googlesource.com/563017 Commit-Queue: Chrome Cunningham <chcunningham@chromium.org> Reviewed-by: Chrome Cunningham <chcunningham@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#485125} Cr-Mirrored-From: https://chromium.googlesource.com/chromium/src Cr-Mirrored-Commit: 495493bad91c1fb4dd30462bb04b6f79c96ecc88
WebKit has received reports that this situation seems to be occurring regularly on CNN.com videos, e.g. this one. Overlapping appends of SAP type 2 content causes "green flashes" where the decoder tries to apply a "type 2" B-frame to a "type 1" I-frame. |
(Though technically, the content seems to be MPEG-2TS, not strictly ISOBMFF content. But since TS is supported as a MSE stream type, the issue still needs fixing somehow.) |
[MSE] Decode glitches when watching videos on CNN.com https://bugs.webkit.org/show_bug.cgi?id=206412 <rdar://problem/55685630> Reviewed by Xabier Rodriguez-Calvar. Source/WebCore: Test: media/media-source/media-source-samples-out-of-order.html The "Coded frame processing" algorithm has a known shortcoming <w3c/media-source#187> when dealing appends of with "SAP Type 2" content, or in general terms, appending data where the resulting samples have presentation times that do not increase monotonically. When this occurs, the ordering of samples in presentation time will be different from the ordering of samples in decode time. The decoder requires samples to be enqueued in decode time order, but the MSE specification only checks for overlapping samples in presentation time order. During appends of out-of-order samples, this can lead to new samples being inserted between a previously appended sample and the sample on which that sample depends. To resolve this, add a new step in the implementation of the "coded frame processing" algorithm in SourceBuffer::sourceBufferPrivateDidReceiveSample(). When the incoming frame is a sync sample, search forward in the TrackBuffer for all previous samples in between the new sync sample, and the next sync sample. All the samples found in this step would fail to decode correctly if enqueued after the new (possibly different resolution) sync sample, so they are removed in this step. * Modules/mediasource/SampleMap.cpp: (WebCore::DecodeOrderSampleMap::findSampleAfterDecodeKey): * Modules/mediasource/SampleMap.h: * Modules/mediasource/SourceBuffer.cpp: (WebCore::SourceBuffer::sourceBufferPrivateDidReceiveSample): LayoutTests: * media/media-source/media-source-samples-out-of-order-expected.txt: Added. * media/media-source/media-source-samples-out-of-order.html: Added. git-svn-id: https://svn.webkit.org/repository/webkit/trunk@254761 268f45cc-cd09-0410-ab3c-d52691b4dbfc git-svn-id: http://svn.webkit.org/repository/webkit/branches/safari-609-branch@255025 268f45cc-cd09-0410-ab3c-d52691b4dbfc
[MSE] Decode glitches when watching videos on CNN.com https://bugs.webkit.org/show_bug.cgi?id=206412 <rdar://problem/55685630> Reviewed by Xabier Rodriguez-Calvar. Source/WebCore: Test: media/media-source/media-source-samples-out-of-order.html The "Coded frame processing" algorithm has a known shortcoming <w3c/media-source#187> when dealing appends of with "SAP Type 2" content, or in general terms, appending data where the resulting samples have presentation times that do not increase monotonically. When this occurs, the ordering of samples in presentation time will be different from the ordering of samples in decode time. The decoder requires samples to be enqueued in decode time order, but the MSE specification only checks for overlapping samples in presentation time order. During appends of out-of-order samples, this can lead to new samples being inserted between a previously appended sample and the sample on which that sample depends. To resolve this, add a new step in the implementation of the "coded frame processing" algorithm in SourceBuffer::sourceBufferPrivateDidReceiveSample(). When the incoming frame is a sync sample, search forward in the TrackBuffer for all previous samples in between the new sync sample, and the next sync sample. All the samples found in this step would fail to decode correctly if enqueued after the new (possibly different resolution) sync sample, so they are removed in this step. * Modules/mediasource/SampleMap.cpp: (WebCore::DecodeOrderSampleMap::findSampleAfterDecodeKey): * Modules/mediasource/SampleMap.h: * Modules/mediasource/SourceBuffer.cpp: (WebCore::SourceBuffer::sourceBufferPrivateDidReceiveSample): LayoutTests: * media/media-source/media-source-samples-out-of-order-expected.txt: Added. * media/media-source/media-source-samples-out-of-order.html: Added. Identifier: 219507@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@254761 268f45cc-cd09-0410-ab3c-d52691b4dbfc Identifier: 218903.102@safari-609-branch git-svn-id: https://svn.webkit.org/repository/webkit/branches/safari-609-branch@255025 268f45cc-cd09-0410-ab3c-d52691b4dbfc
https://bugs.webkit.org/show_bug.cgi?id=206412 <rdar://problem/55685630> Reviewed by Xabier Rodriguez-Calvar. Source/WebCore: Test: media/media-source/media-source-samples-out-of-order.html The "Coded frame processing" algorithm has a known shortcoming <w3c/media-source#187> when dealing appends of with "SAP Type 2" content, or in general terms, appending data where the resulting samples have presentation times that do not increase monotonically. When this occurs, the ordering of samples in presentation time will be different from the ordering of samples in decode time. The decoder requires samples to be enqueued in decode time order, but the MSE specification only checks for overlapping samples in presentation time order. During appends of out-of-order samples, this can lead to new samples being inserted between a previously appended sample and the sample on which that sample depends. To resolve this, add a new step in the implementation of the "coded frame processing" algorithm in SourceBuffer::sourceBufferPrivateDidReceiveSample(). When the incoming frame is a sync sample, search forward in the TrackBuffer for all previous samples in between the new sync sample, and the next sync sample. All the samples found in this step would fail to decode correctly if enqueued after the new (possibly different resolution) sync sample, so they are removed in this step. * Modules/mediasource/SampleMap.cpp: (WebCore::DecodeOrderSampleMap::findSampleAfterDecodeKey): * Modules/mediasource/SampleMap.h: * Modules/mediasource/SourceBuffer.cpp: (WebCore::SourceBuffer::sourceBufferPrivateDidReceiveSample): LayoutTests: * media/media-source/media-source-samples-out-of-order-expected.txt: Added. * media/media-source/media-source-samples-out-of-order.html: Added. Canonical link: https://commits.webkit.org/219507@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@254761 268f45cc-cd09-0410-ab3c-d52691b4dbfc
[MSE] Decode glitches when watching videos on CNN.com https://bugs.webkit.org/show_bug.cgi?id=206412 <rdar://problem/55685630> Reviewed by Xabier Rodriguez-Calvar. Source/WebCore: Test: media/media-source/media-source-samples-out-of-order.html The "Coded frame processing" algorithm has a known shortcoming <w3c/media-source#187> when dealing appends of with "SAP Type 2" content, or in general terms, appending data where the resulting samples have presentation times that do not increase monotonically. When this occurs, the ordering of samples in presentation time will be different from the ordering of samples in decode time. The decoder requires samples to be enqueued in decode time order, but the MSE specification only checks for overlapping samples in presentation time order. During appends of out-of-order samples, this can lead to new samples being inserted between a previously appended sample and the sample on which that sample depends. To resolve this, add a new step in the implementation of the "coded frame processing" algorithm in SourceBuffer::sourceBufferPrivateDidReceiveSample(). When the incoming frame is a sync sample, search forward in the TrackBuffer for all previous samples in between the new sync sample, and the next sync sample. All the samples found in this step would fail to decode correctly if enqueued after the new (possibly different resolution) sync sample, so they are removed in this step. * Modules/mediasource/SampleMap.cpp: (WebCore::DecodeOrderSampleMap::findSampleAfterDecodeKey): * Modules/mediasource/SampleMap.h: * Modules/mediasource/SourceBuffer.cpp: (WebCore::SourceBuffer::sourceBufferPrivateDidReceiveSample): LayoutTests: * media/media-source/media-source-samples-out-of-order-expected.txt: Added. * media/media-source/media-source-samples-out-of-order.html: Added. Canonical link: https://commits.webkit.org/219507@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@254761 268f45cc-cd09-0410-ab3c-d52691b4dbfc Canonical link: https://commits.webkit.org/218903.102@safari-609-branch git-svn-id: https://svn.webkit.org/repository/webkit/branches/safari-609-branch@255025 268f45cc-cd09-0410-ab3c-d52691b4dbfc
[MSE] Decode glitches when watching videos on CNN.com https://bugs.webkit.org/show_bug.cgi?id=206412 <rdar://problem/55685630> Reviewed by Xabier Rodriguez-Calvar. Source/WebCore: Test: media/media-source/media-source-samples-out-of-order.html The "Coded frame processing" algorithm has a known shortcoming <w3c/media-source#187> when dealing appends of with "SAP Type 2" content, or in general terms, appending data where the resulting samples have presentation times that do not increase monotonically. When this occurs, the ordering of samples in presentation time will be different from the ordering of samples in decode time. The decoder requires samples to be enqueued in decode time order, but the MSE specification only checks for overlapping samples in presentation time order. During appends of out-of-order samples, this can lead to new samples being inserted between a previously appended sample and the sample on which that sample depends. To resolve this, add a new step in the implementation of the "coded frame processing" algorithm in SourceBuffer::sourceBufferPrivateDidReceiveSample(). When the incoming frame is a sync sample, search forward in the TrackBuffer for all previous samples in between the new sync sample, and the next sync sample. All the samples found in this step would fail to decode correctly if enqueued after the new (possibly different resolution) sync sample, so they are removed in this step. * Modules/mediasource/SampleMap.cpp: (WebCore::DecodeOrderSampleMap::findSampleAfterDecodeKey): * Modules/mediasource/SampleMap.h: * Modules/mediasource/SourceBuffer.cpp: (WebCore::SourceBuffer::sourceBufferPrivateDidReceiveSample): LayoutTests: * media/media-source/media-source-samples-out-of-order-expected.txt: Added. * media/media-source/media-source-samples-out-of-order.html: Added. git-svn-id: https://svn.webkit.org/repository/webkit/trunk@254761 268f45cc-cd09-0410-ab3c-d52691b4dbfc git-svn-id: https://svn.webkit.org/repository/webkit/branches/safari-609-branch@255025 268f45cc-cd09-0410-ab3c-d52691b4dbfc
Chrome has a user that reported SAP-Type-2 buffering causing gaps, due to buffering segments backwards in (chunk3,2,1 for example). Obtaining clarity in the spec would help implementations both support SAP-Type-2 in MSE and also improve interop for such media. See https://crbug.com/1008756 |
SAP Type 2 (allowed for {audio,video}/mp4) random access points can have a frame later in decode sequence than the random access point, but with a lower presentation time than the random access point.
Suppose there is some buffered video (in presentation timeline) (caps=SAP Type 2 random access point, # = decode sequence)
[a2][A0][a1]
And a new decode sequence (e.g., following an abort() or with a decode sequence discontinuous with previous append) is appended: [B0][b1][b2], having presentation time sequence [b2][B0][b1], with [b2]'s presentation interval overlapping the presentation time of [a1], but neither [B0] nor [b1] overlap any of the previously buffered video.
In the current coded frame processing algorithm, [a1] would be retained. Both [b2] and [a1] would have the same presentation time: which should be retained? Should the overlap of [a1] by [b2] cause [a1] to be removed? (And [a1]'s removal would also trigger the removal of [a2] because the it has decode dependency on [a1].) If you seek to the PTS of [b2] / [a1], which should show?
The coded frame processing algorithm already dictates what to do for the first frame following a decode discontinuity: if it overlaps the presentation start time of some previously buffered frame, remove that previously buffered frame and everything previously buffered in decode order from that removed frame until the next previously buffered random access point. It also maintains a track buffer
highest end time
for use in removing previously buffered media (from a previous coded frame group) overlapped by the new coded frame group as its presentation interval grows into the future.However, for SAP Type 2 random access points allow a coded frame group's presentation interval to grow into the past, too.
This seems to strongly motivate also tracking a track buffer
lowest start time
, and adding overlap removal logic to the spec text.This hasn't been a problem in Chrome yet due to it having incorrectly been using decode time intervals, not presentation time intervals, for managing all buffered range overlaps and reporting buffered range times; now that I'm fixing that implementation to be correct, I see that SAP Type 2 appears to be insufficiently supported by the MSE spec.
@jdsmith3000 @jyavenard @paulbrucecotton FYI - please comment.
-edited to correct to @paulbrucecotton
The text was updated successfully, but these errors were encountered: