Is "running the Encrypted Block Encountered algorithm" the correct way to Attempt to Resume Playback If Necessary? #100

Closed
ddorwin opened this issue Oct 19, 2015 · 20 comments

@ddorwin
Contributor

ddorwin commented Oct 19, 2015

The Initialization Data Encountered and Encrypted Block Encountered algorithms have steps to "Wait for a signal to resume playback."

This wait prevents further processing of the media data, which is an easy way to ensure we know the state of playback. However, this entirely blocks the resource fetch algorithm, and it's weird to block in an algorithm. So, the first question is whether this is correct.

Second, the Attempt to Resume Playback If Necessary algorithm "Attempt[s] to resume the resource fetch algorithm by running the Encrypted Block Encountered algorithm." Running this algorithm while currently waiting in the algorithm doesn't seem correct. It seems even more incorrect when waiting in a different algorithm, which is what happens when the setMediaKeys() algorithm invokes the Attempt to Resume Playback If Necessary algorithm after the Initialization Data Encountered algorithm ends in the waiting step. It's also not clear that running the Encrypted Block Encountered algorithm is the next logical step after unblocking the Initialization Data Encountered algorithm. (Note that it is also possible that setMediaKeys() would unblock the wait in the Encrypted Block Encountered algorithm in some implementations.)

We should a) consider another way of stalling the processing of media data and b) determine the best way to resume based on the outcome of (a).

@mwatson2
Contributor

I think the solution is to suspend the resource fetch algorithm instead of blocking in our Encrypted Block Encountered algorithm. Then the "attempt to resume playback" would check whether it (the resource fetch algorithm) can resume (by explicitly checking the state of the key that is needed) and move it back out of the suspended state.

The resource fetch algorithm already has a "suspended" state:

User agents may decide to not download more content at any time, e.g. after buffering five minutes of a one hour media resource, while waiting for the user to decide whether to play the resource or not, while waiting for user input in an interactive resource, or when the user navigates away from the page. When a media element's download has been suspended, the user agent must queue a task, using the DOM manipulation task source, to set the networkState to NETWORK_IDLE and fire a simple event named suspend at the element. If and when downloading of the resource resumes, the user agent must queue a task to set the networkState to NETWORK_LOADING. Between the queuing of these tasks, the load is suspended (so progress events don't fire, as described above).

To maintain the behavior implied by the existing specification, we need to suppress the networkState change. (IIUC, because the present spec simply blocks in the middle of the resource fetch algorithm, it does indeed stop network downloading, but there is no networkState change because we are not actually suspending the algorithm, just blocking in the middle. In practice we might want downloading to continue even if decryption was blocked, but that is a separate issue.)
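A minimal TypeScript sketch of the suspend-and-resume idea described above. The class `FetchModel`, its fields, and the `SuspendReason` values are hypothetical names chosen for illustration; they do not come from the EME or HTML specifications, and the real algorithms queue tasks rather than running synchronously. Only the numeric values of NETWORK_IDLE and NETWORK_LOADING are taken from HTMLMediaElement.

```typescript
// networkState values as defined on HTMLMediaElement.
const NETWORK_IDLE = 1;
const NETWORK_LOADING = 2;

// Hypothetical: why the model was suspended.
type SuspendReason = "user-agent" | "waiting-for-key";

// Hypothetical model of a suspendable resource fetch.
class FetchModel {
  networkState: number = NETWORK_LOADING;
  suspended: boolean = false;
  events: string[] = [];
  pendingKeyId: string | null = null;
  usableKeys: Record<string, boolean> = {};

  suspend(reason: SuspendReason, keyId: string | null = null): void {
    this.suspended = true;
    this.pendingKeyId = keyId;
    // Only an ordinary download suspension is visible to the page.
    // A key-related suspension leaves networkState and events untouched.
    if (reason === "user-agent") {
      this.networkState = NETWORK_IDLE;
      this.events.push("suspend");
    }
  }

  // "Attempt to resume": explicitly check the state of the needed key
  // instead of re-running the algorithm that was waiting.
  attemptToResume(): boolean {
    if (!this.suspended) return false;
    if (this.pendingKeyId !== null && this.usableKeys[this.pendingKeyId] !== true) {
      return false;
    }
    this.suspended = false;
    this.pendingKeyId = null;
    this.networkState = NETWORK_LOADING;
    return true;
  }
}
```

In this sketch a "waiting-for-key" suspension never fires suspend or changes networkState, which matches the behavior the current blocking text implies.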

@paulbrucecotton

The Sapporo F2F attendees agreed with @mwatson2's proposal: "I think the solution is to suspend the resource fetch algorithm instead of blocking in our Encrypted Block Encountered algorithm."
http://www.w3.org/2015/10/30-html-media-minutes.html#item04

@paulbrucecotton

@ddorwin - Do you agree with the F2F meeting's proposal?

@ddorwin
Contributor Author

ddorwin commented Nov 17, 2015

Suspending and resuming seems fine. The details might need to be worked out. Is suspending and resuming algorithms like this common in other specs? (I see "is suspended" several places in the HTMLMediaElement spec, but no explicit suspending.)

The resource fetch algorithm covers decoding, but there is no explicit decode or fetch (after the initial fetch) step. These things could actually run in parallel. Thus, perhaps we should be suspending playback rather than the resource fetch algorithm.

Also, I think we could encounter an encrypted block after this algorithm and the overall resource selection algorithm have been completed and aborted, respectively. (Step 5 of the resource fetch algorithm.) This makes me wonder whether any references to the resource fetch algorithm are actually correct.

@foolip, any comments or suggestions?

@foolip
Member

foolip commented Nov 17, 2015

I think I need a little more context. Since we're talking about buffering policies, I assume we're talking about plain HTTP resources where that's up to the UA, not MSE. The high-level question is what to do after the waitingforkey event has been fired but before the key has been provided, which may never happen. Please correct me if any of that is inaccurate.

Now, what is the intended behavior? There is already a concept of blocked media element, and a media element is blocked when its readyState is less than HAVE_FUTURE_DATA. Can you just lean on that?

If you want to change the buffering policy while waiting for a key, I don't think that "suspend the resource fetch algorithm" is the right way to phrase it. What can be suspended or throttled is the network fetch itself, not the resource fetch algorithm. If taken literally, that would cause the suspend event to not be fired, because it's part of that algorithm.

The big question I have is how to think about those regions of the timeline where you have buffered the encrypted data, but you either don't have the decryption key for them yet, or you haven't even parsed it to check if you will need to ask for an encryption key. From the point of view of the media element, should that look like it's not even been buffered yet, like buffered data where you may encounter a decoding error, or some other behavior that isn't currently accounted for by the HTML spec?

Happy to advise further if I get some more details of the problem.

@ddorwin
Contributor Author

ddorwin commented Nov 18, 2015

The problem is that we're not actually talking about buffering policies or the amount of data available. In this case, the user agent cannot "advance the current playback position in the direction of playback", but not because the data is unavailable, which appears to be the only case considered by the HTML spec. The media data is available but the use of it (decoding and/or rendering) is blocked (because it is encrypted and the key is not available).

We don't want EME to (normatively) affect buffering and intentionally avoided making this condition appear like a network issue (by overloading network-related events). All we really want to do is suspend playback, but the HTML spec does not appear to have a "render media data" algorithm, so it's unclear exactly where to insert these steps.

Instead, it appears that rendering of media data usually occurs while the resource fetch algorithm is running (this is where, for example, corrupted media data is handled). However, it appears this could also happen outside this algorithm (see step 5 of the resource fetch algorithm as I mentioned in my previous comment). If my interpretation is correct, there is no explicit algorithm for reporting MEDIA_ERR_DECODE once the "entire resource gets loaded and kept available" even if the entire resource has not yet been decoded/played to determine that the media data is not corrupt. It's also unclear what happens after the resource selection algorithm is aborted by step 5 and how playback would continue.

Note: The issue and behavior should be the same for both normal .src= and MSE and should be independent of buffering. However, note that MSE modifies the resource fetch algorithm.

@foolip
Member

foolip commented Nov 18, 2015

However, it appears this could also happen outside this algorithm (see step 5 of the resource fetch algorithm as I mentioned in my previous comment).

You're right, and this makes it hard to achieve what you want by changing the resource fetch algorithm.

If my interpretation is correct, there is no explicit algorithm for reporting MEDIA_ERR_DECODE once the "entire resource gets loaded and kept available" even if the entire resource has not yet been decoded/played to determine that the media data is not corrupt.

Yes, this is a bit broken, I've filed a spec issue:
whatwg/html#346

Anyway, it sounds like what you want is for the playback to stall until the decryption key is available, essentially the same behavior as if the video decoder was infinitely slow. If this is the right way to think about it, and you don't want to toggle the paused state or anything, then I think that monkey-patching blocked media element to also include a "waiting for key" condition would work. WDYT?
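A minimal TypeScript sketch of what such a monkey-patch could amount to. `MediaElementModel`, `waitingForKey`, and `isBlockedMediaElement` are hypothetical names for illustration, and the sketch ignores the other blocking conditions in the HTML spec (paused for user interaction, paused for in-band content). Only the numeric value of HAVE_FUTURE_DATA is taken from HTMLMediaElement.

```typescript
// readyState value as defined on HTMLMediaElement.
const HAVE_FUTURE_DATA = 3;

// Hypothetical, simplified view of a media element.
interface MediaElementModel {
  readyState: number;
  waitingForKey: boolean; // stands in for EME's "waiting for key" value
}

// Existing condition (readyState below HAVE_FUTURE_DATA) plus the proposed
// additional condition (waiting for a decryption key).
function isBlockedMediaElement(el: MediaElementModel): boolean {
  return el.readyState < HAVE_FUTURE_DATA || el.waitingForKey;
}
```

With this definition an element that has plenty of buffered data but no usable key is blocked without readyState, buffered, or the paused state being touched.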

@mwatson2
Contributor

I think adding a new case to "blocked media element" would work and be cleaner than the current approach.

However, when I look at the definition of readyState it is agnostic about the reason why the "current state of the element with respect to rendering the current playback position" might be what it is and the detailed state definitions only talk about availability of the data needed to render frames. In the EME case, the keys are data that are necessary to render the frames.

Anyway, we decided a long time ago not to change readyState because of key unavailability, so I don't suppose we should change this now.

@foolip
Member

foolip commented Nov 20, 2015

I think that readyState and buffered go together, so that if at the current time there's buffered data ahead, then readyState should be at least HAVE_FUTURE_DATA. In other words, if you did want to change the readyState then you'd also have to change buffered to not contain not-yet-decrypted or possibly-encrypted ranges, which I'm sure you want even less than to muck with readyState.

I hope that the "blocked media element" idea works out, and just ping me if there's anything else!

@ghost

ghost commented Nov 24, 2015

The "blocked media element" idea sounds promising.

I don't think we should be blocking the download of media data while we wait for a key. Most EME content will be MSE content anyway, so suspending the download won't really have any effect, right?

The buffered ranges should indicate what segments of the media data are downloaded, demuxable and potentially decodable by the media element. It should not be defined as the segments which are definitely decodable, i.e. the segments for which we have usable keys. Otherwise a player that is requesting media segments based on buffered ranges could end up repeatedly re-requesting the first segments of an encrypted media resource while it is waiting for the required keys to be marked as usable.
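A rough TypeScript sketch of the re-request problem. The `TimeRange` type, the `nextSegmentToRequest` heuristic, and the fixed segment duration are assumptions made for illustration; real players use more elaborate logic.

```typescript
// Hypothetical plain-data stand-in for one entry of a TimeRanges object.
type TimeRange = { start: number; end: number };

// Hypothetical player heuristic: request the segment containing the first
// position at or after currentTime that is not covered by a buffered range.
function nextSegmentToRequest(
  buffered: TimeRange[],
  currentTime: number,
  segmentDuration: number
): number {
  let t = currentTime;
  const sorted = buffered.slice().sort((a, b) => a.start - b.start);
  for (const range of sorted) {
    // Skip forward over any range that contains the current candidate time.
    if (range.start <= t && t < range.end) {
      t = range.end;
    }
  }
  return Math.floor(t / segmentDuration);
}
```

If two 4-second segments have been appended and buffered reports them, the heuristic moves on to segment 2. If buffered instead excluded them because their keys are not yet usable, it would keep returning segment 0 and the player would fetch the same data again.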

In Firefox's implementation, the HAVE_CURRENT_DATA and HAVE_FUTURE_DATA readyStates depend on whether we have enough decoded data enqueued and ready to play at the current playback position and after it, respectively.

So for EME content, we won't reach HAVE_CURRENT_DATA until the first video/audio samples are decoded, i.e. until the required keys are usable, and the CDM has used them to decode the first frame.

The Gecko media playback team has discussed on several occasions changing our implementation for HAVE_CURRENT_DATA and HAVE_FUTURE_DATA to be based on the buffered ranges. Probably we'd still want to gate HAVE_CURRENT_DATA on the first sample for all streams having been decoded, in order to avoid immediately stalling on starting playback right at the very start of the media download before the decoder has had time to produce output.
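The two policies can be contrasted with a small TypeScript sketch. This is not Gecko code: the `DecodeState` fields and both functions are hypothetical, and each policy is reduced to a handful of booleans. The returned numbers are the HTMLMediaElement readyState values (0 HAVE_NOTHING, 1 HAVE_METADATA, 2 HAVE_CURRENT_DATA, 3 HAVE_FUTURE_DATA).

```typescript
// Hypothetical snapshot of what the element knows at the current position.
interface DecodeState {
  hasMetadata: boolean;
  firstSampleDecoded: boolean; // every stream has produced its first decoded sample
  decodedCurrent: boolean;     // decoded data exists for the current position
  decodedAhead: boolean;       // decoded data exists beyond the current position
  bufferedCurrent: boolean;    // buffered ranges cover the current position
  bufferedAhead: boolean;      // buffered ranges extend beyond the current position
}

// Policy described as current: readyState follows decoded data.
function readyStateFromDecoded(s: DecodeState): number {
  if (!s.hasMetadata) return 0;
  if (!s.decodedCurrent) return 1;
  return s.decodedAhead ? 3 : 2;
}

// Policy described as under discussion: readyState follows buffered ranges,
// but still waits for the first sample of every stream to be decoded.
function readyStateFromBuffered(s: DecodeState): number {
  if (!s.hasMetadata) return 0;
  if (!s.firstSampleDecoded || !s.bufferedCurrent) return 1;
  return s.bufferedAhead ? 3 : 2;
}
```

Under the first policy a missing key keeps the element at HAVE_METADATA because nothing can be decoded. Under the second, once the first samples have been decoded, a later missing key would no longer lower readyState as long as the data is buffered.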

@foolip
Member

foolip commented Nov 24, 2015

I don't think we should be blocking the download of media data while we wait for a key. Most EME content will be MSE content anyway, so suspending the download won't really have any effect, right?

@cpearce-mozilla almost certainly knows this, but to be clear, the "blocked media element" solution wouldn't affect the buffering policy at all, all it does is to make potentially playing false, which prevents playback from progressing.

@ddorwin
Contributor Author

ddorwin commented Nov 25, 2015

Updating the definition of blocked media element sounds fine, though that may be a separate issue. The main impact of doing so appears to be that the media element would no longer be potentially playing. At a quick glance, it is not clear whether keys should or should not affect the latter definition. I defer to @foolip's expertise here.

If we do update the definition, we will know what should happen when the key is not available, but what do we do when the key becomes available? The element will no longer be a blocked media element, but there is no explicit indication that playback then resumes. I guess that is consistent with the style in which the HTML spec is written: implicitly, the element is no longer blocked so it should be (potentially) playing.

We still have the issue of where/when the Initialization Data Encountered and Encrypted Block Encountered algorithms are run. I suppose we could follow the same style as the HTML spec and just drop "during the resource fetch algorithm" so we have:

The following steps are run when the media element encounters {Initialization Data in the|a block of encrypted} media data:

We would also need to update/replace the Attempt to Resume Playback If Necessary algorithm. Rather than calling it explicitly, we would probably just say something like the following:

A media element is said to have resumed playback when ...

When the media element resumes playback, synchronously set its waiting for key value to false.


I filed #129 to resolve the readyState issue @cpearce-mozilla raised.

I'm not sure it's relevant here, but I did find that the definition of buffered ranges is different for .src= and MSE. The HTML spec says buffered "represents the ranges of the media resource, if any, that the user agent has buffered" and has nothing to do with whether those ranges are "demuxable and potentially decodable", but MSE appears to only update the data behind the range during the Coded Frame Processing algorithm, which means the data has been demuxed (and is potentially decodable?). @wolenetz, FYI.

@foolip
Member

foolip commented Nov 25, 2015

Right, unlike a lot of other things, the influence of blocked media element is implicit and not spelled out in algorithms. Its effect is via potentially playing in "When an audio element is potentially playing, it must have its audio data played synchronised with the current playback position, at the element's effective media volume."

In other words, modifying the definition of potentially playing directly would give the same results.

Now, this proposal only makes sense if the behavior you want is to silently block playback from progressing when waiting for a key and to silently continue once it has been provided. It would be much like waiting for a very slow decoder, from the point of view of the media element.
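A simplified TypeScript sketch of that implicit effect. `PlaybackModel`, `isPotentiallyPlaying`, and `tick` are hypothetical names; the sketch leaves out the "playback has not stopped due to errors" condition and the other ways an element can be blocked.

```typescript
// Hypothetical, simplified view of a media element during playback.
interface PlaybackModel {
  paused: boolean;
  ended: boolean;
  readyState: number;     // HTMLMediaElement readyState value (0 to 4)
  waitingForKey: boolean; // stands in for EME's "waiting for key" value
  currentTime: number;    // seconds
}

function isPotentiallyPlaying(el: PlaybackModel): boolean {
  // 3 is HAVE_FUTURE_DATA; a key wait is treated as one more blocking condition.
  const blocked = el.readyState < 3 || el.waitingForKey;
  return !el.paused && !el.ended && !blocked;
}

// Advance the playback position only while the element is potentially playing.
function tick(el: PlaybackModel, seconds: number): void {
  if (isPotentiallyPlaying(el)) {
    el.currentTime += seconds;
  }
}
```

While `waitingForKey` is true, `tick` leaves `currentTime` where it is and `paused` stays false. Once the flag is cleared, the same call advances again with no explicit resume step.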

@mwatson2
Contributor

Concretely, I proposed the following:
(1) Monkey-patch the definition of blocked media element to specify that the media element is blocked when waiting for key is set to true.
(2) Invoke the Initialization Data Encountered and Encrypted Block Encountered algorithms in the HTML specification's more declarative style (as suggested by @ddorwin above) rather than the present approach of inserting them into the resource fetch algorithm
(3) Modify the Encrypted Block Encountered algorithm so that when waiting for key is set to true we also remember the block we are trying to decrypt as next encrypted block.
(4) Retain the Attempt to Resume Playback If Necessary algorithm, modifying it so that it explicitly runs the Encrypted Block Encountered algorithm on next encrypted block.

Does this work?
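A minimal TypeScript sketch of points (1), (3), and (4), under the assumption that a key is either usable or not. `ElementModel`, `EncryptedBlock`, `usableKeys`, and `decrypted` are hypothetical names for illustration; the method names mirror the algorithm names but this is not the specification text.

```typescript
// Hypothetical: an encrypted block identified only by the key it needs.
interface EncryptedBlock {
  keyId: string;
}

class ElementModel {
  readyState: number = 4; // HAVE_ENOUGH_DATA
  waitingForKey: boolean = false;
  nextEncryptedBlock: EncryptedBlock | null = null;
  usableKeys: Record<string, boolean> = {};
  decrypted: EncryptedBlock[] = [];

  // (1) The element is blocked while waiting for a key.
  isBlocked(): boolean {
    return this.readyState < 3 || this.waitingForKey;
  }

  // (3) Decrypt if the key is usable; otherwise remember the block and wait.
  encryptedBlockEncountered(block: EncryptedBlock): void {
    if (this.usableKeys[block.keyId] === true) {
      this.decrypted.push(block);
      this.waitingForKey = false;
      this.nextEncryptedBlock = null;
      return;
    }
    this.waitingForKey = true;
    this.nextEncryptedBlock = block;
  }

  // (4) Re-run the algorithm on the remembered block, if there is one.
  attemptToResumePlaybackIfNecessary(): void {
    if (!this.waitingForKey || this.nextEncryptedBlock === null) return;
    this.encryptedBlockEncountered(this.nextEncryptedBlock);
  }
}
```

In this sketch nothing waits inside an algorithm: the element records the pending block and returns, and a later key update calls `attemptToResumePlaybackIfNecessary` to retry it.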

@mwatson2
Contributor

@foolip, @ddorwin, @cpearce-mozilla any comments on the proposal above before I invest the time creating the PR?

@foolip
Member

foolip commented Dec 15, 2015

I can only really comment on the media element integration, but that bit seems right.

mwatson2 added a commit to mwatson2/encrypted-media that referenced this issue Dec 15, 2015
@mwatson2
Contributor

I created a PR: #139

@mwatson2
Contributor

@ddorwin @cpearce-mozilla Could you review the PR I created for this one? (PR #139)

@ddorwin
Contributor Author

ddorwin commented Jan 23, 2016

The PR looks good overall. I left comments there. Thanks for working on this.

mwatson2 added a commit to mwatson2/encrypted-media that referenced this issue Feb 2, 2016
@mwatson2
Contributor

mwatson2 commented Feb 2, 2016

I'll merge the PR later this week if there are no further comments.

mwatson2 added a commit to mwatson2/encrypted-media that referenced this issue Feb 3, 2016
mwatson2 added a commit to mwatson2/encrypted-media that referenced this issue Feb 3, 2016
@jdsmith3000 jdsmith3000 modified the milestone: V1 Jul 19, 2016

5 participants