Media: add method to discover whether playback is allowed before calling play() #3617

Closed
mounirlamouri opened this issue Apr 8, 2018 · 53 comments
Labels
addition/proposal New features or enhancements topic: media

Comments

@mounirlamouri
Member

/CC @cpearce @jernoble

With autoplay policies, a media element may not be allowed to play depending on various rules: browser setting, user behaviour on the page, etc. It was suggested to expose this as a permission but because the behaviour is often dependent on past user behaviour on the page [1], it may make sense to expose the information at a media element level. @cpearce suggested canAutoplay(). The version that returns a promise would work for me.

We could also look into making it more generic such as canPlay() or willPlay() which could be used for other things than autoplay and mostly behave like a dry run and would return the same thing as play() except that it will not start the playback. It may be a bit more painful to handle but may solve more use cases. It would also help resolve situations where autoplay is allowed but playback wouldn't work anyways. Note that one issue with reflecting autoplay for a media element is that it may depend on the availability of the metadata (available tracks for example). We would need to specify what's the expected behaviour with regards to this: should we load the metadata in order to get the information or not? I would rather load all the needed information and call this willPlay() as side effects from calling a method like this may be more expected.

[1]: Chrome allows playback after one user activation on the page; Safari allows playback after one successful playback on the page, etc.

@domenic
Member

domenic commented Apr 8, 2018

Can you explain why a promise-based version is preferable to a boolean version? How would this require time to determine?

I guess maybe the loading-metadata concern from your second paragraph is part of it. Is there anything else worth noting about that part of the design?

@guest271314
Contributor

guest271314 commented Apr 8, 2018

Off-topic @domenic Does this functionality not already exist at Chromium and Firefox by using flags and preferences?

Chromium/Chrome

--no-user-gesture-required Autoplay policy that does not require any user gesture.
--user-gesture-required Autoplay policy to require a user gesture in order to play.
--user-gesture-required-for-cross-origin Autoplay policy to require a user gesture in order to play for cross origin iframes.
--document-user-activation-required Autoplay policy that requires a document user activation
--ignore-autoplay-restrictions Ignores all autoplay restrictions. It will ignore the current autoplay policy and all restrictions such as playback in a background tab. It should only be enabled for testing.

Firefox/Nightly

media.autoplay.enabled
media.autoplay.enabled.user-gestures-needed
media.block-autoplay-until-in-foreground

(Following your policy statement for WHATWG making it clear that non-maintainers should not ask any questions in WHATWG threads to the OP regarding their issue, asked you, a WHATWG "Owner" the question. If your intent was to actually state that you do not want this user to ask any questions at your board whatsoever, then you should state that, else may be browsing these boards at any given time, and have interest in a topic or subject matter, prompting yet further questions at your board, without any regard for whether the question is "appreciated" or not).

@domenic
Member

domenic commented Apr 8, 2018

Off-topic @guest271314 please stop hijacking other people's threads with your questions. These are not "boards"; this is an issue tracker where we are trying to get work done.

@guest271314
Contributor

guest271314 commented Apr 8, 2018

Off-topic @domenic That still does not answer the questions. These are boards, no higher and mightier than any other board.

Your use of the term "hijacking" is ridiculous. You would know the difference if your board was "hijacked".

Am interested in HTML. If you do not want input or question outside of what you conceive then ban this user. Else, have questions, and shall ask them.

@domenic
Member

domenic commented Apr 8, 2018

Off-topic @guest271314 As you persist in such hijacking and state your further intentions to continue to do so unless banned, you are now banned. I'm sorry it's come to this. Perhaps in the future you can come to understand the difference between issue trackers for getting work done, and message boards for idle discussion. Questions and input are welcome, _but not in the middle of other people's issue threads_.

Apologies to those in this thread trying to solve a technical problem for the above off-topic discursions; dealing with this user's pattern of behavior has been an ongoing issue for some time and it's unfortunate that it spilled over into your thread. If anyone wants to discuss those matters further whatwg/meta is the appropriate venue. Otherwise, let's get back to canAutoplay/canPlay/willPlay.

@domenic domenic added addition/proposal New features or enhancements topic: media labels Apr 8, 2018
@cpearce

cpearce commented Apr 9, 2018

@domenic

Can you explain why a promise-based version is preferable to a boolean version? How would this require time to determine?

For Firefox, we may need to call out from our content process to our parent process in order to access a whitelist, and that should be asynchronous. We've found synchronous IPC between processes to be a significant source of UI jank, so we don't want to introduce more.

Also, if we end up having to spin up any kind of demuxing/decoding/parsing infrastructure as @mounirlamouri is suggesting above, that definitely needs to be async, as there's latency involved and our infrastructure there is async anyway.

@cpearce

cpearce commented Apr 9, 2018

We could also look into making it more generic such as canPlay() or willPlay() which could be used for other things than autoplay and mostly behave like a dry run and would return the same thing as play() except that it will not start the playback. It may be a bit more painful to handle but may solve more use cases.

I'm wary of feature creeping here. We have one concrete use case here, what others do we have?

It would also help resolve situations where autoplay is allowed but playback wouldn't work anyways.

It has always irked me that HTMLMediaElement.canPlayType() was not async, as we may need to do disk I/O to load the system's decoders' shared libraries/DLLs from disk and test if they work, and it's pretty horrible to block JS on disk I/O.

And it has irked me that HTMLMediaElement.canPlayType() returns "maybe", but to really be sure we can play some content types on some platforms we need to actually run the decoder and produce decoded samples. We can't just parse the metadata, we need at least enough data for the first frame. So I don't think it's worthwhile going down the path of writing an API that accepts init segments and produces a result.

Also, it would be good to avoid having to download an init segment for a resource in order to tell whether the resource contains an audio track. The motivation for requesting this API was to avoid downloading media data that won't be played.

Note that one issue with reflecting autoplay for a media element is that it may depend on the availability of the metadata (available tracks for example). We would need to specify what's the expected behaviour with regards to this: should we load the metadata in order to get the information or not? I would rather load all the needed information and call this willPlay() as side effects from calling a method like this may be more expected.

Firefox's current draft block-autoplay design takes into account whether the user has interacted with the document, whether the media element is muted or volume != 0, whether the media resource has audio tracks, and whether the media element's document is already whitelisted.

So we may allow autoplay based on properties of the document hosting the media element, or properties of the media resource in question, or properties of the media element hosting the resource.

It's also worth pointing out that AFAIK all browsers that are blocking autoplay are only blocking autoplay of audible media elements. What's interesting to web authors is whether they can play audible media, as it's reasonable, at least in today's environment, to assume that they can inaudibly play media. So maybe the API could just assume it's being asked whether it can play audibly?

So assuming that, in the Firefox and Chrome implementation, the API would basically serve as reporting whether the document had been activated by a user gesture or was whitelisted. I'm not sure how well that maps to Safari's model of blocking autoplay.

@mounirlamouri
Member Author

Regarding the audible aspect, I think we shouldn't assume inaudible media can always autoplay. Chrome Android currently has an option to always block autoplay even for muted content. We don't intend to provide this on desktop at the moment but it's something many users are asking for and I wouldn't bet against it happening one day in one form or another in some browser. We may want to avoid painting ourselves into a corner. (As a side note, Chrome currently doesn't autoplay media with no audio track; we require muted to be used, but that's something we intend to fix.)

Thinking more about restricting to autoplay or not, I would rather have something generic because it will answer more questions (eg. allowed to autoplay but can't play). This said, thinking more about it, I realise that we may hit an issue because of Safari iOS' implementation: last I checked, it relied on a user gesture on the stack in order to allow playback which means that checking asynchronously if playback is allowed wouldn't work very well. We would either need something synchronous or something asynchronous that would be specific to autoplay. Alternatively, we could have something like willPlay({gesture: true}); but I might be going down the rabbit hole :)

I would also prefer to have an async method here as we have been suffering from autoplay logic being mostly sync in Blink. I'm afraid some websites may be relying on the following behaviour:

media.play();
if (media.paused) // autoplay blocked

To stay safe, we have been sending all the information we have down to the renderer process aggressively. It will be easier to make all this async once the platform has a better way to deal with autoplay detection. (Arguably media.play().catch() may be it but it still feels like a hack.)
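
For comparison with the synchronous `media.paused` check, here is a minimal sketch of detection through the promise returned by play(). It assumes play() returns a promise that rejects with NotAllowedError when playback is blocked by policy; the helper name `tryAutoplay` and its result strings are hypothetical, not part of any proposal in this thread.

```javascript
// Hypothetical helper: attempts playback and reports what happened.
// `media` is anything whose play() returns a promise, such as an
// HTMLMediaElement in browsers that implement the promise-returning play().
async function tryAutoplay(media) {
  try {
    await media.play();
    return "playing";
  } catch (err) {
    // NotAllowedError is the rejection used when playback is blocked by policy.
    if (err && err.name === "NotAllowedError") {
      return "blocked";
    }
    // Any other rejection (unsupported source, aborted load, ...) is a
    // different kind of failure and should not be read as "autoplay blocked".
    return "failed";
  }
}
```

Unlike the synchronous check, this does start playback when it is allowed, so it only answers the question after the fact.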

@jernoble

jernoble commented Apr 9, 2018

@mounirlamouri

This said, thinking more about it, I realise that we may hit an issue because of Safari iOS' implementation: last I checked, it relied on a user gesture on the stack in order to allow playback which means that checking asynchronously if playback is allowed wouldn't work very well.

I had the same thought, and was in the middle of composing a reply to that effect, when I realized WebKit could just adapt our policy to allow playback from the resolve handler of a canAutoplay() promise. While I think a synchronous API would be fine (I don't think we'd need to do anything requiring asynchronous calls to implement canAutoplay()), we could certainly implement a promise-based API as well.

@jernoble

jernoble commented Apr 9, 2018

@mounirlamouri

We could also look into making it more generic such as canPlay() or willPlay() which could be used for other things than autoplay and mostly behave like a dry run and would return the same thing as play() except that it will not start the playback.

That sounds a lot like load(). Why not just modify load() to return a promise?

@domenic
Member

domenic commented Apr 9, 2018

That sounds a lot like load(). Why not just modify load() to return a promise?

Proposed by @foolip a couple years ago: #554

@jernoble

jernoble commented Apr 9, 2018

What would canAutoplay() return for an un-muted video element whose readyState was HAVE_NOTHING? (Assuming the UA blocked autoplaying elements with audio tracks.)

  1. Return false? Returning true would definitely be incorrect, so if canAutoplay() returns a bool, then false is the only choice here.
  2. Return 'maybe'? Borrowing from canPlayType() just perpetuates the hilarity of that API, but it's arguably the correct answer.
  3. Return a Promise which resolves when the readyState reaches HAVE_METADATA? An unresolved promise as an answer has a relatively unambiguous meaning of "I can't tell you now".

I'm starting to come around to 3, but returning a Promise raises further questions: does canAutoplay() initiate the media element load algorithm? If not, does it reject with INVALID_STATE error, or does it stay unresolved until something else causes a load() to occur?

@mwatson2

We (Netflix) would definitely like to have an API like this. Use-cases:
(a) by default we auto-play a trailer (or similar) for the currently selected content item when the user enters the site. If auto-play is disabled we'd like to download a nice banner image for that content item instead. So we would like to know before downloading any media whether auto-play will work
(b) deeplinks for Netflix content that go to a page which immediately plays the content. It's very much the user intent that the content autoplays, since that is the whole point of the deeplink. A graceful fallback for this would also be desirable (for example, we could immediately render a "play" button of some kind and buffer the content while the user works out they need to click it). I'm not sure the new API is essential for this second use-case, but it would make it simpler.

I wonder whether there could be utility in a mechanism for a site to declare that the video it wants to autoplay is the primary content of the page (as it is with a deeplink) and then be allowed to autoplay even if autoplay is otherwise disabled? A mechanism for the user to quickly revoke this per origin in case of abuse would be needed. Maybe the browser shows an option "Didn't want this to play? Click here." and it's a one-strike-and-you're-out for the site to be able to use this mechanism? At a high level there should be a way that the good actors can design experiences fully aligning with user intent, whilst keeping the bad actors in check.

@jernoble

@mwatson2

(a) by default we auto-play a trailer (or similar) for the currently selected content item when the user enters the site. If auto-play is disabled we'd like to download a nice banner image for that content item instead. So we would like to know before downloading any media whether auto-play will work

I don't know that it's going to be possible to implement a canAutoplay() which works in the absence of media data. Safari and Firefox have exceptions to their autoplay blocking for media without audio tracks (and IIRC, Chrome is moving in this direction as well).

NB: Netflix could just serve their animated banners as muted-by-default. This "feature" is an example of the kind of bad website behavior that is a primary reason for blocking automatic playback of audible content.

That said, wouldn't something like this solve that use-case?: <video src="url" poster="banner" preload="metadata" autoplay>. IOW, autoplay, but if you can't, don't load any additional media data and display the "rich banner image" instead.

@mwatson2

@jernoble Is there any reason that autoplay permission would be dependent on any aspect of the media other than whether there is audio or not? Seems unlikely. If not, then the site could be allowed to discover whether autoplay with audio is allowed as well as whether silent autoplay is allowed.

The content preview on the Netflix website plays as the single largest element of the page, taking the full width and most of the height of the page. It's the main event when you visit the page. We found that providing these previews gives users a better idea of the content to inform their choice of what to play, compared to still images. If autoplay is not appropriate for this kind of use, I'm not sure what it is for.

I don't want to make assumptions about the alternative behavior when autoplay (with or without audio) is disabled. I'd just like to know what I can do and then I can design a good experience for that case.

So, for example, I wouldn't want to assume that the fallback is just to display a poster image in the video element. Fetching metadata (both the manifest for the media streams and the media headers from the streams) takes time and resources. I'd just like to be able to branch as soon as possible based on what the site is allowed to do and deliver a good experience according to the user preferences.

@cpearce

cpearce commented Apr 12, 2018

I don't know that it's going to be possible to implement a canAutoplay() which works in the absence of media data.

Authors can today cheaply detect whether they can autoplay by loading a base64 encoded video into an HTMLMediaElement as a data URI src, calling play(), and observing whether the returned promise resolves or rejects, for example:
https://bug1442186.bmoattachments.org/attachment.cgi?id=8955074
(based on https://github.com/video-dev/can-autoplay )

If the audio track was all 0 samples, the video would be inaudible, but still blocked as if it was audible, at least in today's implementations.

If we can't figure out how to do this without requiring a loaded media element, then we're probably not better than the can-autoplay JS library, and so we probably shouldn't bother.
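
A rough sketch of that probing technique, written from the description above rather than copied from the can-autoplay library. The function name `probeAutoplay`, the `createVideo` factory, and the `probeSrc` parameter are all hypothetical; the caller supplies the small clip (with an audio track), for instance as a data URI.

```javascript
// Hypothetical probe. `createVideo` returns a fresh media element (in a page:
// () => document.createElement("video")); `probeSrc` is a tiny clip with an
// audio track, supplied by the caller.
async function probeAutoplay(createVideo, probeSrc) {
  // Try audible playback first, then fall back to muted playback.
  for (const muted of [false, true]) {
    const video = createVideo();
    video.muted = muted;
    video.src = probeSrc;
    try {
      await video.play();
      video.pause(); // The probe element is disposable; stop it right away.
      return muted ? "muted" : "audible";
    } catch (err) {
      // Rejected: blocked by policy or unplayable. Try the next configuration.
    }
  }
  return "none";
}
```

The result describes the throwaway probe element only; whether it carries over to other media elements on the page depends on the UA's policy.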

@jernoble

@cpearce Except that perturbs the can autoplay flag. What @mounirlamouri is proposing wouldn't affect the state of the <video> element at all, which is not possible by checking the promise returned by play().

@cpearce

cpearce commented Apr 12, 2018

@jernoble Sure. I had assumed authors would create the video element solely for the purpose of determining whether they could autoplay in general, and dispose of it once they're done.

My point was, there already exists a way to determine that we can autoplay videos with an audio track, with only a small download overhead (the base64 encoded video). So it seems better to try to spec something that doesn't require a loaded media element, as otherwise we're not resolving the complaint which is authors want to determine whether they can autoplay without having to download media data.

@jernoble

@cpearce

otherwise we're not resolving the complaint which is authors want to determine whether they can autoplay without having to download media data.

Well that's certainly one use case. But the general case is whether this video can autoplay, not whether videos in general can, because authors are probably more interested in playing specific videos rather than the philosophical question of whether any can. And this technique (checking the play() promise) relies on a convention (that once any video plays, the rest can too) that is not by any means fixed. It doesn't work on Mobile Safari, for example.

@cpearce

cpearce commented Jun 20, 2018

So trying to bring together the threads we have here and make some progress...

Use cases (paraphrased from @mwatson2's cases above, I've heard both of these cases from people other than Netflix as well FWIW):

a. Author has a video loading or loaded in an HTMLMediaElement, if JS calls play(), will it play?
b. Author wants to know "if I pay the price of downloading a media and try to play it, will it play?".

Proposal: HTMLMediaElement.canAutoplay() returns a promise, which resolves with undefined if this media element can be expected to autoplay given its current state, and rejects otherwise.

The UA can check whether the media element's muted attribute is true to consider audibleness. So script can call canAutoplay() on a media element with the muted attribute set to true to determine whether the UA supports inaudible autoplay, or without the muted attribute to determine whether audible autoplay is supported.

If the UA wants to block autoplay of unmuted media elements which have audio tracks, and a load is in progress, it can optionally wait until the media element has loaded metadata before fulfilling the play() promise, so it can check for the presence of an audio track.

I discussed with my colleagues here what to do in the case where the load algorithm hasn't been started yet. We propose that for expediency, canAutoplay() should assume worst case for all unknowns, i.e. if the UA blocks autoplay for media with an audio track and there's no load in progress, canAutoplay() should assume that the media has an audio track (and authors can set the HTMLMediaElement's muted attribute to true to express in advance that they want inaudible autoplay). If the UA prompts for user approval to autoplay, and there's no stored permission, the UA should assume that permission won't be granted.

This means that authors who want to avoid paying the price for downloading media which won't be able to autoplay can do so without having to wait for a dummy video to load into the media element, and this won't perturb the state of the media element by requiring the load of a dummy resource.
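
To make the worst-case rule concrete, here is a small model of the decision as a pure function. This is only a sketch under assumptions: every field name on `state` is invented for illustration, and a real UA would weigh more inputs than these.

```javascript
// Hypothetical model of the "assume worst case for all unknowns" rule.
// state.hasAudioTrack is true, false, or undefined when no metadata is loaded.
// state.storedPermission is "granted", "denied", or undefined if never asked.
function canAutoplayWorstCase(state) {
  if (state.muted) {
    // Muted elements are judged by the UA's inaudible-autoplay policy.
    return Boolean(state.inaudibleAutoplayAllowed);
  }
  // Unknown track layout: assume the media has an audio track.
  const hasAudio = state.hasAudioTrack !== false;
  if (!hasAudio) {
    return Boolean(state.inaudibleAutoplayAllowed);
  }
  // No stored permission: assume the user would not grant it if prompted.
  return Boolean(state.userActivated) || state.storedPermission === "granted";
}
```

With nothing loaded and no user interaction, an unmuted element is reported as not able to autoplay, while setting muted in advance lets script ask about inaudible autoplay instead.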

@cpearce

cpearce commented Jun 27, 2018

We are keen to implement this feature before we ship our block autoplay, so does anyone have any further comments here?

Now that our implementation is further along, I believe that our implementation of canAutoplay could be synchronous, i.e. we could make this:

readonly attribute boolean canAutoplay;

The advantage here is that calling script is simpler. The disadvantage of this approach is it means implementations can't await readyState>=HAVE_METADATA before responding.

Note we chose to not take into account whether a media has audio tracks in our blocking logic, and rely on the muted attribute instead. So we don't need to wait for readyState>=HAVE_METADATA. But I'm happy to make this async if it makes it easier for other implementations to make it work with their blocking logic, and to leave scope for future blocking logic changes which may require it to be async.

@mwatson2

mwatson2 commented Jun 28, 2018 via email

@jernoble

@mwatson2, speaking for Safari, it’s a variant of (b). If the .muted property is set to false within a user gesture event handler, playback will continue normally. If .muted is set to false outside a user gesture event handler, playback will pause (assuming playback began without a user gesture).

@jernoble

@cpearce Safari has no requirement for this to be an async API. We would probably settle on returning false from this API if .muted was false and the readyState was HAVE_NOTHING.

@cpearce

cpearce commented Jun 28, 2018

@mwatson2 when the volume or muted attribute changes, our current implementation checks to see whether that media element is allowed to autoplay audibly, and if not, it pauses the media element.

So unmuting an autoplaying video which had an audio track in a user gesture handler would not pause the video. If script set muted to false and the document had not had any user interaction, we'd pause the video.

@cpearce

cpearce commented Jul 2, 2018

@mounirlamouri Do you have an opinion on #3617 (comment)?

Perhaps we should just call it "allowedToPlay" instead of "canAutoplay". So we could make this:

readonly attribute boolean allowedToPlay;

Then it matches the exception name that the play() promise is rejected with when playback is blocked.

It occurred to me that we should have something similar for WebAudio.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Aug 2, 2018
Various web authors have expressed desire to know in advance whether autoplay
will work.

They want this in order to avoid paying the price for downloading media that
won't play. Or they want to take other action such as showing a poster image
instead.

This is of particular interest to Firefox, as we're planning on showing a
prompt to ask the user whether they would like a site to play. If sites want to
determine whether they can autoplay but avoid the prompt showing, they won't be
able to just call play() in Firefox and see whether it works, as that would
likely show the prompt if the user doesn't already have a stored permission.

We've been working out a spec here:
whatwg/html#3617 (comment)

This implements what is the consensus to date there;
HTMLMediaElement.allowedToPlay, which returns true when a play() call would not
be blocked with NotAllowedError by autoplay blocking policies.

MozReview-Commit-ID: AkBu0G7uCJ0

--HG--
extra : rebase_source : 3f31db79aa1e570fdd9fc7062d0ddac7c96a8931
jankeromnes pushed a commit to jankeromnes/gecko that referenced this issue Aug 2, 2018
@mounirlamouri
Member Author

If we go with a synchronous call that will therefore only use the muted attribute (unless metadata are loaded) and given the need for Web Audio and now Web Speech APIs, would it make sense to instead move this boolean to the Document and have it reflect whether audible playback is currently allowed? I think all UAs currently have different ways of defining when audible playback is allowed as in Firefox has a prompt mechanism/permission, Safari has a whitelist/permission and Chrome has the Media Engagement mechanism. Though, this boolean could reflect all of these.

The downside is that one wouldn't be allowed to know if a given HTMLVideoElement is allowed to play in some configuration but this would most likely only be a problem for muted autoplay. There are small differences between UAs around this but maybe we should work towards converging these implementations instead of exposing an API to expose the differences?

WDYT?

@jernoble

@mounirlamouri Mac Safari allows users to choose to block all media playback, audible or not. Your proposal would not allow pages to detect this state, but the .allowedToPlay() proposal would.

@mounirlamouri
Member Author

Could we handle this by not having a boolean but maybe an enum? none or something similar would reflect that no playback is allowed while muted would represent that only muted playback is allowed. WDYT?

@cpearce

cpearce commented Aug 19, 2018

Returning a simple enum value doesn't tell the whole story; there could be multiple ways that playback could be achieved. For example, as well as muted playback being allowed, playback may (or may not) be allowed in a user gesture handler, and Firefox could play audibly if the site tries and the user approves the play via our prompt.

If we wanted an enum based solution, I think we'd need to return a list of enum values describing all the ways playback could be achieved.

So we have something more concrete to compare with, I think it would need to look something like:

enum AutoplayPolicy {
  "allowed", // Audible playback allowed. Mutually exclusive with other values.
  "muted", // Playback allowed for muted elements.
  "gesture", // Audible playback allowed in/after user gesture.
  "prompt", // If attempted, audible playback may be approved or denied by user via prompt.
  "disabled" // No playback allowed under any conditions. Mutually exclusive with other values.
};

partial interface HTMLMediaElement {
  sequence<AutoplayPolicy> autoplayPolicies();
};

So script could execute: document.createElement("video").autoplayPolicies().includes("allowed") to tell whether they could autoplay audibly unconditionally, and then inspect the other policy enum values after that if desired.

Firefox for example would return ["muted", "gesture", "prompt"] for a non-muted media element on non-whitelisted sites which hadn't yet had user interaction.

["allowed"] would be returned if the user had previously authorized that site to autoplay, or it was whitelisted, or if the site had already had user interaction, or in Chrome's case if the site had passed the MEI threshold. If the user had previously denied permission to autoplay, or Chrome's MEI threshold wasn't passed, ["muted", "gesture"] would be returned.
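
The Firefox cases described above can be summarized as a small lookup. This is an illustrative model only: the function name `autoplayPoliciesFor` and the fields on `site` are invented, and letting user interaction take precedence over a stored denial is an assumption, since the description does not say how those two combine.

```javascript
// Hypothetical model of the policy lists for a non-muted media element.
// site.permission is "granted", "denied", or undefined when nothing is stored.
function autoplayPoliciesFor(site) {
  if (site.permission === "granted" || site.whitelisted || site.userInteracted) {
    return ["allowed"];
  }
  if (site.permission === "denied") {
    // The user already said no, so a prompt is no longer on offer.
    return ["muted", "gesture"];
  }
  // No stored decision yet: the UA could still ask the user via a prompt.
  return ["muted", "gesture", "prompt"];
}
```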

A complication here is what if a UA wanted a user gesture to be required for inaudible playback at some point? Then we'd need "gesture-muted". Or if we wanted to add some other un-block condition here, the enum would need to grow.

Using it would look like:

let policies = video.autoplayPolicies();
if (policies.includes("allowed") || policies.includes("prompt")) {
  video.play().catch(e => {
    // user denied permission to play via prompt...
    // Wait for a click on play button...
  });
} else if (policies.includes("muted")) {
  video.muted = true;
  video.play();
} else if (policies.includes("gesture")) {
  // Wait for click on play button...
} else {
  // Give up! Show an image?
}

Compared to the boolean variant:

if (video.allowedToPlay) {
  video.play();
} else {
  video.muted = true;
  if (video.allowedToPlay) {
    video.play();
  } else {
    // Inaudible autoplay disabled.
    // Give up? Wait for play button click?
  }
}

The advantage of the boolean variant here, I think, is that it's simple and somewhat future-proof. It's easy to use with UAs' current block autoplay policies; most sites out there today which are sniffing to see if they can autoplay try to play something non-muted, then fall back to playing muted, and in most cases that works or they (have to) give up.

The advantage of the enum-list approach is it tells script exactly what it needs to do to be able to autoplay. With the boolean variant, script needs to try combinations of attributes/conditions to see if they work.

The enum variant allows the UA to express that media is playable in/after a gesture handler, or if media is disabled altogether (does Safari really disable all media, or just inaudible autoplay?). The boolean variant implies that playback would be allowed in/after a user gesture handler, but can't explicitly express that all media playback is disabled, though it can express inaudible autoplay disabled.

@mounirlamouri
Member Author

You seem to add one requirement in the document-level API that is exposing what would be needed to change state while the element-level API does not have this requirement. Why?

The main tangible difference between both APIs is that at an element-level we do not need to say that autoplay is allowed muted or not because the information may already be available when calling the API. However, as mentioned, some UAs may or may not block muted autoplay and some other APIs would benefit from this too. I think that it's worth moving the API to be more generic. However, I don't think we should expose to the website what mechanism the UA is using to allow playback as this is UA-specific and will inevitably change over time.

@cpearce

cpearce commented Aug 20, 2018

The main tangible difference between both APIs is that at an element-level we do not need to say that autoplay is allowed muted or not because the information may already be available when calling the API.

So you agree that having HTMLMediaElement return an enum {"allowed", "blocked", "allowed-muted"} is unnecessary, as the boolean variant is sufficient when combined with script being able to set the HTMLMediaElement's muted attribute?

However, as mentioned, some UAs may or may not block muted autoplay and some other APIs would benefit from this too. I think that it's worth moving the API to be more generic.

I think having a generic API on Document would not work in the Safari case as they have per-element state that affects whether an HTMLMediaElement is allowed to play. Which is why the autoplayPolicies() proposal was on HTMLMediaElement, and not on Document.

You seem to add one requirement in the document-level API that is exposing what would be needed to change state while the element-level API does not have this requirement. Why?

Since Safari's logic to allow autoplay relies on per-element state, I think the only way to make a generic Document-level API which covers all UAs would be to list the conditions under which playback was allowed in a Document.

However, I don't think we should expose to the website what mechanism the UA is using to allow playback, as this is UA-specific and will inevitably change over time.

The boolean HTMLMediaElement.allowedToPlay variant doesn't expose the underlying mechanism the UA uses to allow playback. :-)

I prefer the boolean variant due to its simplicity. Do you have a counter-proposal that works across UAs?

I think other Web APIs should have something similar, but be tailored to how those APIs work. Here the boolean allowedToPlay variant is proposed to be true when the play() promise would not be rejected with NotAllowedError, which makes sense in the context of the HTMLMediaElement API. Other APIs should have something that matches how blocking is expressed for those APIs.

@mounirlamouri
Member Author

Safari macOS does not require a user gesture for each media element but activating one media element would activate them all. In other words, after activating one media element, the page status would go from allowed-muted to allowed. While on Safari iOS, one could imagine the status to stick to allowed-muted forever. I do not think the document-level API would be incompatible with Safari's implementation. Though, maybe @jernoble has a different point of view.

I think having an API only exposed on HTMLMediaElement is not future facing. Blink is already blocking the Speech API because it was abused for autoplay and Web Audio will obviously also be blocked. I would expect all UAs to implement a central system to decide about playback so it would be a bit weird if all APIs had to expose individually what is the autoplay status.

@cpearce

cpearce commented Aug 23, 2018

Safari macOS does not require a user gesture for each media element but activating one media element would activate them all.

I tested this in Safari, and you're indeed correct. Thanks for pointing that out.

I think having an API only exposed on HTMLMediaElement is not future facing.

"allow-muted" makes sense in the case of HTMLMediaElement which are explicitly mutable via attributes (though maybe it should be "allow-inaudible"?) but does that make sense in the case of WebAudio or WebSpeech?

I suppose for WebAudio you could assume that a graph which isn't connected to a destination node is inaudible/muted, and so allowed to run? I'm not sure what makes sense for WebSpeech.

So we're clear, you're proposing something like:

enum AutoplayPolicy {
  "blocked", // No automatic playback allowed.
  "allowed",  // automatic playback allowed.
  "inaudible", // automatic playback of inaudible media allowed.
};

partial interface Document {
  readonly attribute AutoplayPolicy autoplayPolicy;
};
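
To make the question concrete, here is a rough sketch of how a page might consume such an enum, assuming it were exposed as `document.autoplayPolicy` exactly as in the IDL above. This is only a proposal under discussion here, not a shipped API; the function name `applyAutoplayPolicy` and the `showPlayButton` callback are hypothetical.

```javascript
// Assumes the proposed values "allowed", "inaudible", and "blocked".
// `showPlayButton` is a hypothetical page-supplied callback.
function applyAutoplayPolicy(policy, video, showPlayButton) {
  switch (policy) {
    case "allowed":
      return video.play();
    case "inaudible":
      video.muted = true;
      return video.play();
    case "blocked":
    default:
      // Unknown values are treated like "blocked" to stay conservative.
      showPlayButton();
      return Promise.resolve();
  }
}
```

A page would call it as `applyAutoplayPolicy(document.autoplayPolicy, video, () => { /* reveal a play button */ })`.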

@jernoble

iOS Safari has a per-element restriction, not per-document, so this Document based API is still a non-starter for us.

I understand that because Google's autoplay restriction is Document-based, they would prefer a Document-based API. And since (iOS) Safari's autoplay restriction is Element-based, we would prefer an Element-based API. But an Element-based API is implementable even in a Document-based restriction model, and not vice versa.

@cpearce

cpearce commented Sep 11, 2018

@mounirlamouri do you have a concrete counter-proposal that can work in all major browsers? I would like to proceed with the boolean HTMLMediaElement.allowedToPlay variant as it can work across browsers.

@foolip
Member

foolip commented Sep 18, 2018

@dtapuska, does #4009 provide a way to do this as well?

@dtapuska
Contributor

navigator.userActivation.hasBeenActive gives some idea of whether a user gesture is needed for autoplay, but I believe there are other things that gate autoplay, like feature policies etc.
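
A minimal sketch of reading that signal is shown below. The helper name `hasUserActivated` is hypothetical, and per the caveat above the result is only a hint: a `true` value does not guarantee that `play()` will succeed.

```javascript
// Returns true or false, or undefined when the User Activation API is missing.
function hasUserActivated(nav) {
  const activation = nav && nav.userActivation;
  return activation ? activation.hasBeenActive : undefined;
}
```

In a page this would be called as `hasUserActivated(navigator)`.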

@cpearce

cpearce commented Sep 25, 2018

Based on Chromium's commit logs, it looks like @mounirlamouri is no longer working on block autoplay. Other people at Google seem to think the boolean HTMLMediaElement.allowedToPlay proposal is acceptable. Shall I make a spec PR for this?

@mounirlamouri
Member Author

Well, I still work on this @cpearce :) I do not think the allowedToPlay solution checks all the requirements we have and it's mostly a band-aid for a short term problem. What are the use cases that the document-based solution wouldn't solve?

@jernoble

The Document-based model is not implementable on iOS Safari, as the restriction is not Document-based there.

@cpearce

cpearce commented Sep 26, 2018

@mounirlamouri I am happy to hear you're still working on this. I would like to find a solution that is acceptable to all major browsers. To move forward I think it would be helpful to be explicit about what the requirements are. What do you perceive the requirements to be?

@mwatson2

mwatson2 commented Sep 27, 2018 via email

@jernoble

jernoble commented Sep 27, 2018

@mwatson2, I think this framing is incomplete, because (for Safari at least, and possibly Firefox) there are three possibilities, not two.

  • Allow audible auto play
  • Allow inaudible auto play
  • Allow no auto play

Depending on user choice. (There is a vocal minority who prefer the last option.) Any API that only distinguishes between the first two states will fail the third group.

Consider the use case of deciding whether to provide a “play button” affordance in an autoplaying video:

video.addEventListener('canplaythrough', event => {
    if (!video.allowedToPlay)
        video.muted = true;
    if (!video.allowedToPlay)
        playButton.className = 'show';
}, { once: true });

How would an “automute autoplay” state solve this use case?

@mounirlamouri
Member Author

@jernoble it sounds like returning three different values instead of a boolean may help, as it would avoid websites testing whether autoplay is allowed after setting the muted attribute.

Chrome did consider automute, but we dropped it as it got a very bad reception from some of the developers we socialised the idea with, and we expected the experience would likely be more confusing to users.

@jernoble what are your intentions with regard to Web Audio and allowedToPlay? Would you recommend exposing the exact same property to the AudioContext? If this is less of a priority, could a compromise be to have an autoplay status API exposed on the document and, when autoplay is element-specific, also on the HTMLMediaElement?

@jernoble

jernoble commented Oct 4, 2018

@mounirlamouri

it sounds like returning three different values instead of a boolean may help, as it would avoid websites testing whether autoplay is allowed after setting the muted attribute.

Either the element is allowed to play, or it's not. There is no third state. No additional complexity is needed to implement muted checking for pages which want to check it.
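
As a rough illustration of that point, a page that wants the three-way answer could derive it from the boolean alone. The sketch below assumes the proposed `allowedToPlay` property reflects the element's current `muted` state synchronously; the helper name `classifyAutoplay` and its return strings are hypothetical.

```javascript
// Returns "audible", "inaudible", "none", or "unknown" (property unsupported).
// Assumes the proposed boolean `allowedToPlay` tracks the current `muted` state.
function classifyAutoplay(video) {
  if (typeof video.allowedToPlay !== "boolean") return "unknown";
  const wasMuted = video.muted;
  try {
    video.muted = false;
    if (video.allowedToPlay) return "audible";
    video.muted = true;
    return video.allowedToPlay ? "inaudible" : "none";
  } finally {
    video.muted = wasMuted; // Restore the caller's original muted state.
  }
}
```

Toggling `muted` on a real element fires `volumechange` events, so a check like this is best done before playback starts.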

Would you recommend exposing the exact same property to the AudioContext?

Web Audio cannot "autoplay" like a media element can, nor can it be "muted" like media elements, and it already has AudioContext.state and a promise-returning resume() function. The task of determining whether an AudioContext is allowed to play is much simpler than it is for a media element.

We could ask for .allowedToPlay to be added to AudioContext, but the Web Audio authors may prefer to expose that as a new enum value on their existing state, rather than a new property.
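
A small sketch of what that looks like with the existing Web Audio surface follows. It uses only `AudioContext.state` and `resume()`; the helper name `resumeOnFirstClick` is hypothetical, and whether a context starts out `"suspended"` when autoplay is blocked is UA-dependent, so treat this as an assumption rather than guaranteed behaviour.

```javascript
// Resumes a suspended AudioContext on the first click on `target`.
// Returns true if a listener was installed, false if nothing was needed.
function resumeOnFirstClick(ctx, target) {
  if (ctx.state !== "suspended") return false;
  target.addEventListener("click", () => { ctx.resume(); }, { once: true });
  return true;
}
```

In a page, `target` would typically be `document` or a play button element.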

@mounirlamouri
Member Author

We discussed this at TPAC (see w3c/autoplay#1) and we are going to move the discussions over to the autoplay repo.

gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 3, 2019
Various web authors have expressed desire to know in advance whether autoplay
will work.

They want this in order to avoid paying the price for downloading media that
won't play. Or they want to take other action such as showing a poster image
instead.

This is of particular interest to Firefox, as we're planning on showing a
prompt to ask the user whether they would like a site to play. If sites want to
determine whether they can autoplay but avoid the prompt showing, they won't be
able to just call play() in Firefox and see whether it works, as that would
likely show the prompt if the user doesn't already have a stored permission.

We've been working out a spec here:
whatwg/html#3617 (comment)

This implements what is the consensus to date there;
HTMLMediaElement.allowedToPlay, which returns true when a play() call would not
be blocked with NotAllowedError by autoplay blocking policies.

MozReview-Commit-ID: AkBu0G7uCJ0

UltraBlame original commit: f5f55e363d318effdfd504061c50f87e06301c6f
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 3, 2019
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 3, 2019