Confusing duration change consequences and inconsistent outcome for user agents. #124
Also pointing out another discrepancy related to the duration. Say we have the following two tracks, one audio and one video, and we call remove(3, 6) on each SourceBuffer, then set mediaSource.duration = 3. A follow-up call to mediaSource.endOfStream() will run the duration change algorithm with the value of the highest end time, which is now 5, and so mediaSource.duration becomes 5. So 3, 4 or 5 are all possible resulting values when setting the duration to 3.

BTW, this exact scenario occurs in the media-source/mediasource-duration.html platform test (available at https://github.com/w3c/web-platform-tests/blob/master/media-source/mediasource-duration.html), which now reports failure after implementing the new MSE spec.
@jyavenard Thanks for reporting.
@jdsmith3000 What do you think?
I don't really understand what the ability to render partial frames has to do with the change of duration, the removal of frames via the range removal algorithm, or the buffered range calculation.

The range removal algorithm will never truncate a frame; either a whole frame is removed or none of it. The buffered range calculation also only uses whole frames (as they are now stored in the track buffer). The only time partial rendering of frames matters is during the coded frame processing algorithm, where a newly added frame may overlap an existing one. But once the overlap is resolved, the frame can be considered whole again (its end timestamp has been adjusted and the original duration of that frame is no longer available). It could have mattered in the past, when changing the duration could have removed frames (and one could have considered the possibility of truncating partial frames then).

As such, should the ability to render partial frames matter at all when it comes to reporting the buffered range or changing the duration? I think not. So I believe that the ability to render partial frames should be excluded from those algorithms. Partial rendering itself can still be allowed, considering the above (and it's quite a nice feature for gapless transitions and so on). The inconsistent behaviour of the duration change algorithm makes it difficult to write universal tests such as the web platform test.
I'll also investigate how Chromium does this, since that might provide the least-regressing route to clarification in the spec. I suspect it caps the buffered range end times to not exceed the current duration value, thereby side-stepping some of the resulting spec confusion.
Indeed, Chromium caps SourceBuffer.buffered's highest end time to be <= highest_end_time (per spec), and also to <= current duration (not in the spec, but perhaps a helpful clarification): https://cs.chromium.org/chromium/src/media/filters/chunk_demuxer.cc?sq=package:chromium&rcl=1469027825&l=153
That could be an easy alternative, especially now that we can't set a duration to a value that would require removing samples from the source buffer. However, we have to be very careful when drafting this not to introduce a cyclic dependency. What was the reason for Chromium to implement that change? It's disappointing when a user agent does its own thing; with Chromium's market share, that now makes it a de-facto rule.
I am looking at this now. It is true that the partial audio/text frame logic can lead to different durations based on implementation ability, but it would be consistent for a given UA, right? I do wonder about the benefit of allowing this variance. It seems like the rules for implementations that can't handle partial frames could perhaps be applied everywhere. That way, duration would always be greater than or equal to what's buffered. @wolenetz: Is this what you are suggesting?
@jdsmith3000 and @wolenetz: Should this be tagged V1Editorial? |
I'm looking into our implementation. Having the can handle it/can't handle it options seems undesirable. @wolenetz Please confirm that you adjust duration to new duration always when partial audio or text tracks are buffered beyond duration limits. I'm not sure if this is just editorial, since it could affect implementations. |
@jyavenard (#124 (comment)): Chromium's detection of playback reaching the end of the media resource is a bit broken: it renders everything read from each of the streams until all the currently active streams have signalled that end of stream has been reached. This should only occur if endOfStream() was previously called. The actual duration of rendered media might be beyond the set duration, since nothing in Chromium's duration change algorithm updates the streams or renderers to indicate a partial frame. Hence, Chromium appears to be doing a hybrid approach to the duration change logic (it effectively doesn't support partial end-of-media-resource frame rendering), yet it clamps the buffered range logic to make it look somewhat like it does support such partial frame rendering. Plan A: so as not to break existing web apps, which might get confused if a buffered range end time were indeed seen to be > duration (not possible in Chromium's hybrid/broken implementation currently), I propose to:
@jyavenard : would this approach best match FF's implementation? Per @jdsmith3000 (who's currently OoO) in #124 (comment), it seems Edge might already not support handling partial frames. If no UA currently handles partial end-of-media-resource frame rendering, the net observable change would be:
This seems partially substantive, especially if these net observable changes are important to web authors, or if any UA relies on the previous allowance for partial end-of-media-resource frame rendering. @paulbrucecotton: please advise on whether a CfC is needed to resolve this. @jyavenard: please advise on the impact to the FF implementation. --edit: marked this proposal as Plan A.
Plan B: As one alternative to the approach presented in #124 (comment), and with less substantive change, we could reduce the net observable change by dropping 2.c. This still assumes no UA currently supports partial end-of-media-resource frame rendering, but still fixes the existing problem around muxed SourceBuffers. As in Plan A, Chromium would need to be fixed to not do its non-standard buffered range clipping to duration (which is effectively a no-op due to also fixing Chromium to comply with the steps around not supporting partial end-of-media-resource frame rendering). |
OK, I understand the problem you were describing at first (it got me confused for a while). Ignoring duration change for the moment: duration will always be greater than or equal to the highest end time found in any track. I'm rather confused about where this talk of rendering partial frames originates and why it has any impact on the way duration is calculated. Currently, the duration change algorithm is the only place where the ability to render partial frames comes into account. If neither Blink, Edge nor Firefox, and I'm guessing not WebKit either, has this ability (not that it would be a hard thing to do; Firefox in particular already has the ability to seek within a frame), then perhaps removing that ambiguity altogether is the best thing to do.

BTW, Firefox does the same as what you describe for how frames are played: when endOfStream has been called and normal playback is going (e.g. not seeking), it will simply demux all frames until each stream reports EOS, and it will continue playing until there are no more frames to decode.

What I do believe, however, is that if Chrome is doing something in a particular, yet reasonable, way, then we should amend the spec so that it's well documented, and prevent other UAs from getting bug reports because "Chrome doesn't do it that way" :)
I'll produce a PR. |
I think Plan B would be best: it's less substantive, and it also retains some notification to web apps that something about duration has changed. In this (redundant durationchange) case, the event indicates that the desired duration (different from the current duration) was adjusted. Only in the case where the desired duration is exactly the current duration will durationchange be suppressed.
I agree with removing text here about partial frame rendering. @wolenetz: I believe you plan to remove step 4 (the partial frame conditional statement) and renumber 4.1 and 4.2 as steps that always execute. Correct? If so, that makes sense to me. |
On looking deeper into this, I wonder if this might break web apps, if going with Plan A or B instead of allowing for behavior like what Chromium's hybrid currently implements (giving apps the impression of support for partial frame rendering, capping buffered range highest end times to duration, yet perhaps for some stream types like audio playing the last few decoded samples, which might be beyond the duration but within a coded frame presentation interval which begins at or below duration). App breakage or implementation/spec fragility arises out of Plan A or B in the cases where duration is set by the app but results in something the app wasn't expecting (a higher duration). In degenerate cases, such as an extremely long text track cue, the resulting duration could be huge. So, Plan C will be what I'll produce a PR for. It will include a non-normative note which allows implementations that cannot render partial audio frames to play those last few samples, so long as the app-visible currentTime never exceeds duration (Chromium looks like it might have a bug here, where currentTime might exceed duration even in its current implementation). @jdsmith3000 @jyavenard Please comment ASAP if you think Plan C is severely flawed.
I disagree. Plan B could change behavior if a particular UA supports partial frame playback now and changes to the revised spec that eliminates it; but it standardizes on a valid behavior under the current spec, and so should not be considered a breaking change. The Chrome hybrid, if anything, changes playback behavior because it poses as a partial-frame UA but plays beyond duration. How can that be better?
Plan B's change of duration to something the app did not expect affords less interoperability than a few milliseconds of audio perhaps playing beyond the duration. Note that the allowance in Plan C is not normative; implementations can choose to improve their quality and not play those few decoded audio samples that exceed duration. Even without that improvement, Plan C forbids implementations from reporting a currentTime value that exceeds duration. The concern here is app breakage, which I believe you also mentioned concerns about during the editors' sync yesterday, @jdsmith3000.
Following discussion in #148, Plan B is the way forward. I'm producing a new PR for that now. -- edit: undid accidental issue closure.
…c. r=gerald The MSE spec was recently updated to use the highest end time across tracks rather than across the buffered ranges. See w3c/media-source#124 and the fix described in w3c/media-source#154 MozReview-Commit-ID: 4CqI8d2e9gu --HG-- extra : rebase_source : b25f0e2a76c517c0dca0a9def00edd6eff38d8ad
In the updated spec, we have:
http://w3c.github.io/media-source/#duration-change-algorithm
I'll explain the issue with an example. Assume we have a video SourceBuffer with the following 3 samples:
[0,2)[2,4)[4,6)
If we call remove with the arguments (3, 6), the resulting SourceBuffer will now contain:
[0,2)[2,4)
since a sample is only removed per "Remove all media data, from this track buffer, that contain starting timestamps greater than or equal to start and less than the remove end timestamp." (from Coded Frame Removal).
The sample [2,4) isn't truncated.
When setting duration to 3, however, if the user agent is able to partially render audio frames, then the resulting duration will stay 3.
However, for a user agent unable to partially render audio frames, the duration will be clamped to the highest end time, so duration for those user agents will be 4 instead.
Both types of user agents will have a buffered range of [0,4), yet one has a duration of 3 and the other 4 (which is even more confusing, as we do not have to worry about audio frames here).
I don't believe there is anything in the spec indicating that playback would have to stop once currentTime reaches duration; on the contrary:
https://w3c.github.io/html/semantics-embedded-content.html#offsets-into-the-media-resource
When played, that last sample will update the duration, in which case once playback has ended we would have a final duration of 4.
As such, I believe that the conditional on the user agent's ability to partially render audio frames would need to be removed.
Additionally, these steps have side effects when they change the duration away from what was originally set. If you do:
mediaSource.duration = 3;
mediaSource.duration = 3;
then, since in both cases the duration was actually set to 4, the first step of the duration change algorithm (the check that new duration equals the current duration) will always be false, and the durationchange event will be fired twice. (This issue arose in the web-platform-test media-source/mediasource-duration.html, which specifically checks that the durationchange event is not fired when setting the duration to the same value multiple times in a row.)
As a workaround for this issue, I suggest that in the final step, prior to "Update the media duration to new duration and run the HTMLMediaElement duration change algorithm.", we add: "If the value of old duration is equal to new duration, then return."
This will ensure that no redundant durationchange event is fired.