Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

waitingforkey as spec'd could fire at different times in different browsers #336

Closed
cpearce opened this issue Oct 17, 2016 · 17 comments
Closed

Comments

@cpearce
Copy link

cpearce commented Oct 17, 2016

The "waitingforkey" event is spec'd in "7.3.4 Wait for Key" to be fired immediately once the algorithm is called. This algorithm is called from inside the "Attempt to decrypt" algorithm. I know that both Firefox and Chrome decode ahead of the current playback position in order to smooth over any irregularities in decoding timings. However, I note that Firefox (when decoding in software) decodes 10 frames ahead but Chrome decodes 4 ahead (IIRC). So if we fire "waitingforkey" as soon as we hit a frame that we can't decode, we may end up still playing for 10 frames or so in Firefox and 4 frames in Chrome. This means we could also dispatch events such as "timeupdate" after the "waitingforkey". Confusingly, we'd be advancing the current playback position while the readyState is HAVE_METADATA.

I recall that the "waitingforkey" event is primarily as an indicator for developers that playback has stalled.

So I propose we change the "Wait for Key" algorithm to have an extra step, inserted before step 4 which enqueues a task to dispatch the event:

Wait until all decrypted and decoded media data has been played.

NOTE: As a result of this step, the "waitingforkey" event will not fire until all decryptable media data has been played, and playback has stalled.

This would enforce that browsers fire "waitingforkey" at a consistent time.

@cpearce
Copy link
Author

cpearce commented Oct 17, 2016

That is, Firefox and Chrome could end up firing "timeupdate" events after firing "waitingforkey", as we'd still be advancing the current playback position, and we won't be playing for the same duration of time as our decoded frame queues are different lengths. So we could end up firing different numbers of "timeupdate" events after the "waitingforkey" event.

@ddorwin
Copy link
Contributor

ddorwin commented Oct 17, 2016

I agree that it doesn't make sense to "be advancing the current playback position while the readyState is HAVE_METADATA." "Suspend[ing] playback" probably doesn't make sense either if there are available frames.

Perhaps we should reposition/refactor this algorithm to be based on decrypted frames being exhausted.

I'm not sure we can/want to block as suggested in this algorithm. Maybe that's okay since this is within the (effectively) decoding algorithms and thus not blocking the main thread.

/cc @foolip for comments

@ddorwin ddorwin added this to the V1NonBlocking milestone Oct 17, 2016
@ddorwin
Copy link
Contributor

ddorwin commented Oct 17, 2016

FWIW, there is also this NOTE after the text that specifies when to call the Encrypted Block Encountered algorithm:

The above step provides flexibility for user agent implementations to perform decryption at any time after an encrypted block is encountered before it is needed for playback.

@mwatson2
Copy link
Contributor

I agree a change is required to make the intended timing of the "waitingforkey" event unambiguous. Perhaps there are steps that should be executed "when media from an encrypted block is required for presentation", and these steps would check whether the block has been decrypted and fire the event / suspend playback if it has not.

@foolip
Copy link
Member

foolip commented Oct 18, 2016

Is the timing of the "waitingforkey" event already close to when playback will stall, such that it can't be used to trigger key refresh and have smooth playback? If so, how about just firing when playback has actually stalled?

@cpearce
Copy link
Author

cpearce commented Oct 19, 2016

@foolip I'd guess "waitingforkey" is close enough to the end of playable ranges that the network round trip time for a new license would be long enough that playback would be interrupted. I also expect that it's unlikely that script could figure out what key they need to request a license for without outside information, as we don't expose information about keys the CDM knows it needs.

If we tried to piggyback on the logic for firing "waiting" that's already in the HTMLMediaElement spec, we'd potentially not fire "waitingforkeys" in the case where we don't have a key for the first sample. Because we only fire the "waiting" event "if the readyState was HAVE_FUTURE_DATA or more, and the new ready state is HAVE_CURRENT_DATA or less".

@ddorwin
Copy link
Contributor

ddorwin commented Oct 19, 2016

Yes, waitingforkey is not intended to enable smooth playback. It is an indicator that something has gone wrong.

@ddorwin
Copy link
Contributor

ddorwin commented Oct 25, 2016

Is there a proposed fix for this issue?

@cpearce
Copy link
Author

cpearce commented Oct 25, 2016

We could add a new behaviour change to the HTMLMediaElement in the "The following modifications are made to the behaviour of the HTMLMediaElement" section, something like:

When the user agent cannot advance the current playback position because the CDM does not have a usable key to decrypt the media data ahead of the current playback position, or if the user seeks to a position in the media data for which the CDM does not have a usable key, the user agent SHALL run the Waiting for Key algorithm.

We could then remove the run waiting for key step in the Attempt to Decrypt algorithm.

@foolip
Copy link
Member

foolip commented Oct 25, 2016

#336 (comment) makes sense to me. Be sure to also define and test the order compared to other events that are fired when reaching the end of buffered/playable data.

@ddorwin
Copy link
Contributor

ddorwin commented Oct 25, 2016

Does @cpearce's proposal cover the initial playback case (load algorithm)? That would seem to be neither advancing (time 0 is the current time) nor seeking.

ddorwin added a commit to ddorwin/encrypted-media that referenced this issue Oct 25, 2016
@ddorwin
Copy link
Contributor

ddorwin commented Oct 25, 2016

PR #346. I have attempted to cover the initial playback case as well.

We could probably handle the two conditions via some "is blocked on a key" state, but then we'd need to reset it on seeking... which led me to realize we don't set other variables on seeking (#347).

Once we fix #347, we might be able to replace the "because the CDM does not have a key that is usable to decrypt the media data" text in the PR with "and the internal waiting for key value is true." Does that miss any cases?

@foolip, what else do we need to define regarding reaching the end of buffered/playable data? Isn't that defined somehow by the algorithms? We should never reach the point of decrypting blocks if we're at the end, right?

@ddorwin
Copy link
Contributor

ddorwin commented Oct 26, 2016

I'll create another PR that uses the waiting for key value (after fixing #347). Any concerns about that approach?

@foolip
Copy link
Member

foolip commented Oct 26, 2016

@ddorwin, you'll need use enough words so that pedantic tests can be written based on that to cover the order and timing of at least "timeupdate", "waiting" and "waitingforkey" events, and if they are fired in the same task or in separate tasks. (You can tell the difference at microtask checkpoints, most easily tested with Promise.resolve().then(/* at a microtask checkpoint */) I think.)

@foolip
Copy link
Member

foolip commented Oct 26, 2016

In other words, saying "queue a task for fire x and y" vs. "queue a task to fire x and queue a task to fire y" is observably different.

@ddorwin
Copy link
Contributor

ddorwin commented Oct 26, 2016

PR #349 is the alternate PR that uses an internal variable. I don't think reusing waiting for key makes sense since it controls whether the event is fired, so I used a new variable.

@ddorwin
Copy link
Contributor

ddorwin commented Oct 26, 2016

@foolip, I think we have been careful to be mindful of tasks, though I have not thought about the implications for all such events. Both proposed PRs probably give some leeway in when exactly the Wait for Key algorithm is run (similar to the HTML spec), but what happens after that should be clearly defined.

One item that may be of interest: The Wait for Key algorithm is not run in a task and and runs the following steps synchronously in order: sets readyState, queues a task to fire the "waitingforkey" event, then suspends playback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants