
Does canAutoplay() need to be async? #3

Closed
cpearce opened this issue Nov 12, 2018 · 19 comments

Comments

@cpearce

cpearce commented Nov 12, 2018

Apologies if I'm raising things that were covered in person at TPAC...

Issue #1 proposes an HTMLMediaElement.canAutoplay() API, and suggests making it async. Here's the current proposal:

HTMLMediaElement.canAutoplay() returns a promise that resolves if autoplay for a specific media element is allowed or rejects with some helpful error if not:

video.canAutoplay()
.then(_ => {
  // When muted videos can autoplay, it could be either
  // because the video is muted, or the video has no audio track.
  // It would also succeed if the call is during a user activation,
  // or if the video was previously played.
  // The video must have metadata loaded (readyState > HAVE_NOTHING).
})
.catch(error => {
  // If the video metadata did not load (readyState == HAVE_NOTHING),
  // or the call did not happen on a user gesture, or, when muted videos
  // are allowed, if the video is not muted and has an audio track.
  // InvalidStateError will be used if the metadata did not load.
  // NotAllowedError will be used for all other cases.
});

Why does this API need to be async? It would be simpler if it were synchronous. What case does making this API async cover that a boolean readonly attribute on HTMLMediaElement (or a function returning a boolean) couldn't cover?

I'm guessing this was proposed as async out of desire to handle the case where the UA wants to allow media without an audio track to autoplay, but the UA can't know whether the resource has an audio track until the media load algorithm has reached readyState HAVE_METADATA?

If the media load algorithm hasn't started, should canAutoplay() return a rejected promise? I don't think it would be intuitive for calling canAutoplay() to cause the media load algorithm to start.

If the media load algorithm has started, should canAutoplay() wait until readyState >= HAVE_METADATA before resolving/rejecting?

If we don't want to await readyState reaching HAVE_METADATA before resolving the canAutoplay() promise, then I don't think canAutoplay() needs to be async.

If the load algorithm errors, we'd need to reject all outstanding canAutoplay() promises.

How long do we wait for the readyState transition to HAVE_METADATA? I've heard from some large video sites that they cannot rely on the play() promise being rejected, and I suspect this is likely due to Safari awaiting readyState HAVE_METADATA before deciding whether the play() promise should be rejected, and the network causes the load to be slow, and authors are not waiting long enough. That is, authors call play(), set a timeout and the timeout fires before the play() promise resolves.

So I think that making the behavior of canAutoplay() bound by the network may make it be considered "unreliable" by authors.

What if something that affects autoplay state changes in between canAutoplay() being called, and the promise being resolved? For example if autoplay had been allowed in a resource with an audio track because the HTMLMediaElement had the muted attribute set, but after calling canAutoplay(), script removed the muted attribute? The result would be incorrect.

If the UA only allows playbacks in a user gesture handler, and canAutoplay() was called in a user gesture handler, the UA will need to propagate along the "in a user gesture handler" state to the promise continuation. This is not a problem in itself, but we should mention it in a spec somewhere so it's behavior that authors expect. What if inside the canAutoplay() promise handler, script does a setTimeout()? Should the timer continuation there also be considered to be in a user gesture handler?

What if the canAutoplay() promise is passed as one of many promises into Promise.all()? Should the promise returned by Promise.all() also be considered as in a user gesture handler? I'd posit that "no" is a reasonable answer to this, but we should be explicit.

If canAutoplay was synchronous, much of the above wouldn't be a problem - but we would have to decide on behavior when the synchronous function/attribute was accessed before metadata had been loaded.

I think it's simpler if canAutoplay() were synchronous, but if we want it to be async, I think we need to clarify:

  • Behavior of canAutoplay() when called before media load algorithm has started.
  • Behavior of canAutoplay() when called after the media load algorithm has started but before readyState has reached HAVE_METADATA.
  • Behavior of canAutoplay() if state that affects whether autoplay is allowed changes after canAutoplay() is called but before the promise resolves/rejects.
  • Propagation of "in a user gesture handler" state to promise/timeout continuations.
@cpearce
Author

cpearce commented Nov 12, 2018

Also, does canAutoplay() resolving successfully mean that the media resource is of a format supported by the UA, and that it is a valid/playable resource? That is, should the promise resolution be delayed until the "canplay" event has been fired?

@jernoble

@cpearce said:

Why does this API need to be async? It would be simpler if it were synchronous. What case does making this API async cover that a boolean readonly attribute on HTMLMediaElement (or a function returning a boolean) couldn't cover?

Without putting too many words into his mouth, at TPAC @mounirlamouri made the argument that canAutoplay() should return a Promise so that UAs could return all the normal Error codes like NotAllowedError and InvalidStateError without having to make up some new return value. IIRC, the rough spec was something like:

if readyState < HAVE_METADATA
   reject(InvalidStateError)
else if !isAllowedToPlay
   reject(NotAllowedError)
else
   resolve()
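For illustration only, that rough spec could be modeled as a plain decision function; the HAVE_METADATA value and the isAllowedToPlay flag here are stand-ins for element state, not part of any shipped API:

```javascript
const HAVE_METADATA = 1; // same value as HTMLMediaElement.HAVE_METADATA

// Returns the name of the error the promise would reject with, or null
// if it would resolve.
function canAutoplayOutcome({ readyState, isAllowedToPlay }) {
  if (readyState < HAVE_METADATA) return "InvalidStateError";
  if (!isAllowedToPlay) return "NotAllowedError";
  return null;
}

// Promise-returning wrapper in the shape of the proposed API.
function canAutoplay(state) {
  const errorName = canAutoplayOutcome(state);
  if (errorName === null) return Promise.resolve();
  const e = new Error(errorName);
  e.name = errorName;
  return Promise.reject(e);
}
```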

@jernoble

To continue my thought, and to put more words into @mounirlamouri 's mouth, the incentive for making canAutoplay() promise-based was strictly about the return type, and not about async-vs-sync.

@cpearce
Author

cpearce commented Nov 14, 2018

To continue my thought, and to put more words into @mounirlamouri 's mouth, the incentive for making canAutoplay() promise-based was strictly about the return type, and not about async-vs-sync.

If that's the case, then unfortunately there's not a nice way in WebIDL to synchronously return multiple types, particularly because returning an object may be evaluated as "true-ish".
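To make the "true-ish" hazard concrete, here's a sketch (canAutoplaySync is hypothetical, and Error stands in for DOMException):

```javascript
// A hypothetical sync canAutoplay() that returns an error object on failure.
function canAutoplaySync() {
  const e = new Error("NotAllowedError"); // stand-in for a DOMException
  e.name = "NotAllowedError";
  return e; // signals "failure", but...
}

const result = canAutoplaySync();
if (result) {
  // ...this branch runs, because all objects are truthy. Naively written
  // code would treat the failure value as success.
}
console.log(Boolean(result)); // true
```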

One idea (suggested by heycam) would be to invert the semantics of the API, so it returns the DOMException that a play() call would be rejected with, or null if play() would succeed. So something like:

let error = video.wouldPlayReject();
if (!error) {
  video.play();
} else {
  // handle error, fallback etc..
}

As best I can see with simple grepping through the spec and Firefox's source, the play() promise is only rejected with AbortError, NotSupportedError, and NotAllowedError. I am not sure of the value of exposing the other failure types, as they're avoidable: AbortError generally happens when you pause() after a play(), and NotSupportedError can be avoided by using canPlayType().

Do we really need to expose all the errors that could happen here? How much of the media load algorithm should we run here?

@ghost

ghost commented Nov 14, 2018

canAutoplay should not affect the state of the media element. If metadata is not loaded then we should return InvalidStateError.

One advantage is that it works well with the async play function so you can use promise chaining e.g.

video.canAutoplay()
  .then(() => video.play())
  .then(() => /* success */)
  .catch((e) => /* error */);

Here sites could catch e which would be the DOMException from either video.canAutoplay() or video.play().

@mounirlamouri
Member

Having an async API is a better model. All browsers probably bend over backwards to make sure that they know synchronously whether something can autoplay. I would like to make sure we have the opportunity to remove this pain from the platform. By adding asynchronous APIs to detect autoplay, we may end up in a place where no one will rely on the video.play(); if (video.paused) pattern.

On top of this, the promise offers a much nicer API than what is available with the sync alternative.

@cpearce what drives the request for a sync version?

@cpearce
Author

cpearce commented Nov 16, 2018

canAutoplay should not affect the state of the media element.

Agreed.

One advantage is that it works well with the async play function so you can use promise chaining e.g.

video.canAutoplay()
  .then(() => video.play())
  .then(() => /* success */)
  .catch((e) => /* error */);

I think examples are a great way to discover the consequences of design choices.

Typically I see authors fall back to muted autoplay if their audible autoplay is blocked, so to be more representative of uses we'll see in the wild, our examples should handle being blocked and try to fall back to muted. So the example I'd give would be something like this:

      function playWithMutedPlayFallback(video) {
        return video.canAutoplay().catch((e) => {
          console.log("canAutoPlay failed with " + e.message)
          video.muted = true;
        }).then(() => video.play());
      }

      playWithMutedPlayFallback(video)
        .then(
          () => console.log("Should be playing..."),
          (e) => console.log("cannot play: " + e.message));

In the "cannot play" catch, I'd assume authors will fall back to showing a poster frame.

As a counter example, here's how I'd do that with the synchronous variant:

      function playWithMutedPlayFallback(video) {
        if (!video.allowedToPlay) {
          console.log("Not allowed to play non-muted");
          video.muted = true;
        }
        return video.play();
      }

      playWithMutedPlayFallback(video)
        .then(
          () => console.log("Should be playing..."),
          (e) => console.log("cannot play: " + e.message));

The synchronous variant has less syntax and layers of callbacks, so I think it's easier to reason about. I certainly found it easier to write; the async variant took several refactors to slim down to something so concise, and to get the error handling aligned with the behaviour I wanted.

Here are the examples live, poly-filled:
https://cpearce.github.io/html-test-cases/proposed-canautoplay-ignore-readystate.html
https://cpearce.github.io/html-test-cases/proposed-allowedToPlay-ignore-readyState.html

If you try the live examples and look at the console logging, you'll see they always fall back to muted due to the readyState always being < HAVE_METADATA when the script polls whether it is allowed to autoplay.

So irrespective of whether we make this sync or async, I'd recommend we spec the readyState >= HAVE_METADATA check to be after the other checks that report that autoplay is allowed, i.e. the pseudocode for determining whether a media element is "allowed to play" would be:

if (document/HTMLMediaElement is gesture activated) {
  // allowed to play
}
if (HTMLMediaElement has muted attribute or volume == 0) {
  // allowed to play
}
// check any other conditions that allow playback...
if (HTMLMediaElement.readyState < HAVE_METADATA) {
  // not allowed to play, "invalid state".
}
if (media resource doesn't have audio track) {
  // allowed to play
}
// otherwise, not allowed to play.

This means script can always determine whether playback will be allowed due to gesture activation even in the case where metadata isn't loaded yet. They can also determine whether muted autoplay is allowed without gesture activation, even in the case where metadata isn't loaded yet.
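As a sketch (not spec text), the ordering above could be modeled as a pure function over a snapshot of element state; gestureActivated and hasAudioTrack are illustrative stand-ins for UA-internal state:

```javascript
const HAVE_METADATA = 1; // same value as HTMLMediaElement.HAVE_METADATA

// Returns "allowed", "invalid-state", or "not-allowed".
function allowedToPlay(el) {
  if (el.gestureActivated) return "allowed";
  if (el.muted || el.volume === 0) return "allowed";
  // ...any other conditions that allow playback would go here...
  if (el.readyState < HAVE_METADATA) return "invalid-state";
  if (!el.hasAudioTrack) return "allowed";
  return "not-allowed";
}

// The point of the ordering: a gesture-activated element gets a definitive
// answer even before metadata has loaded.
allowedToPlay({ gestureActivated: true, muted: false, volume: 1, readyState: 0 });
// → "allowed", not "invalid-state"
```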

@cpearce
Author

cpearce commented Nov 16, 2018

Having an async API is a better model.

Promises are certainly much nicer than callbacks for operations that are asynchronous. But AFAICT, we actually don't need to do anything that's async here?

On top of this, the promise offers a much nicer API than what is available with the sync alternative.

Do you think in the examples I posted above that the async case is nicer? Can you refactor it to make it as simple as the sync example?

@cpearce what drives the request for a sync version?

Simplicity. But also, as I said above, I'm wary of state changes in the media pipeline meaning the result posted to the promise callback gets out of sync with the state of the world. The canAutoplay() implementation actually being synchronous under the hood, plus promise callbacks running in microtasks, goes a long way towards alleviating this concern, but if canAutoplay() were actually async (that is, if tasks run while the decision as to whether a media element is allowed to play is being made), then readyState changes could make the results posted by this API incorrect.

@ghost

ghost commented Nov 17, 2018

Your synchronous variant does not take into account metadata not being loaded and handling an invalid state error. When you add that it becomes much more complicated than the async variant.

@cpearce
Author

cpearce commented Nov 18, 2018

So irrespective of whether we make this sync or async, I'd recommend we spec the readyState>=HAVE_METADATA check to be after the other checks that report that autoplay is allowed

Do you disagree with this statement?

@cpearce
Author

cpearce commented Nov 18, 2018

Your synchronous variant does not take into account metadata not being loaded and handling an invalid state error.

This was intentional.

As I understand it, the purpose of this API is to determine whether a media element can autoplay as quickly as possible. An author may wish to play a different media (or start the download of subtitles) in the case where they can't autoplay audibly. As such, waiting for the network load required for the media element to reach readyState >= HAVE_METADATA works against that goal.

When you add that it becomes much more complicated than the async variant.

I appreciate that you're trying to make your replies as concise as possible, but it would be helpful if you could back up your assertions with supporting evidence.

If you want to wait for loaded metadata, so you can get a slow-but-definitive answer first time, I'd do it like this:

      function once(target, name) {
        return new Promise(function(resolve, reject) {
          target.addEventListener(name, e => resolve(e), {once: true});
        });
      }      

      let loaded_metadata = new Promise((resolve, reject)=>{
        once(video, "loadedmetadata").then(resolve);
        once(video, "error").then(e => reject(e));
      });
      loaded_metadata.then(() => playWithMutedPlayFallback(video));

So that would be the same in both cases. How could this be improved?

@mounirlamouri
Member

As I understand it, the purpose of this API is to determine whether a media element can autoplay as quickly as possible.

This is not what we were aiming for. During our meeting at TPAC, we decided that the document-level API should be used for a quick answer and the element-level API should be used for a clear answer: "can I play now?". Because metadata are key in order to answer this question, we decided that we would require this state in order to answer the question instead of having a case where the UA answers "no, but ask me again later".

I feel that we are going in circles, trying to re-do the discussions we had for 2 hours at TPAC. @padenot and @cpearce maybe you two could talk in order to clarify the situation? It may be better than trying to rehash the same arguments over here?

@cpearce
Author

cpearce commented Nov 19, 2018

the element-level API should be used for a clear answer: "can I play now?". Because metadata are key in order to answer this question, we decided that we would require this state in order to answer the question instead of having a case where the UA answers "no, but ask me again later".

If we want to avoid having a case where the UA answers "no, but ask me again later", then we should not reject the promise with InvalidStateError if readyState has not yet reached HAVE_METADATA. That's equivalent to saying "try again later".

In order to get a definitive answer from the API as proposed in issue #1, authors would have to wait for loadedmetadata before they can safely call the API. This makes the API more likely to be incorrectly used.

So when canAutoplay() is called, and if we don't already know that playback will be allowed due to some other condition such as gesture activation, and readyState has not yet reached HAVE_METADATA, we should delay resolving the promise until metadata has been loaded.

Obviously, that would be a reason for this API to be async.

How does that sound?

@ghost

ghost commented Nov 20, 2018

If we can answer before we have the metadata (e.g. gesture) we should resolve immediately. However, I am concerned that not returning immediately if there is no data will result in hanging promises. Since the canAutoplay function does not actually change any state there is no guarantee that the promise will ever resolve.
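The hanging-promise concern isn't specific to media: a promise whose executor never settles leaves every chained continuation waiting forever, and the only defense a caller has is an explicit timeout race. A minimal sketch (withTimeout is a hypothetical helper):

```javascript
// A promise that never settles, like a canAutoplay() waiting on a load
// that was never started.
const hanging = new Promise(() => {});

// The caller's only defense: race the promise against a timeout.
function withTimeout(promise, ms) {
  const timeout = new Promise((_, reject) =>
    setTimeout(() => reject(new Error("TimeoutError")), ms));
  return Promise.race([promise, timeout]);
}

withTimeout(hanging, 100).catch((e) => {
  console.log(e.message); // "TimeoutError" — the caller had to bail out itself
});
```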

What about this: when we come to the readyState check, we resolve the promise immediately unless the element is loading, in which case we delay resolving until the load either succeeds or fails. For example:

  1. Check gesture activation => return true if activated
  2. If muted attribute set or volume == 0 => return true
  3. If we have not started loading and readyState < HAVE_METADATA => "invalid state"
  4. If we have started loading, wait for the element to finish loading:
     a. If we successfully loaded metadata, continue to (5)
     b. If we failed to load metadata => "invalid state"
  5. If the media resource does not have an audio track => return true
  6. Otherwise, not allowed to play.
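As a sketch, those steps could be written as an async function over a model element; loading, loadFinished (assumed here to be a promise resolving to whether metadata loaded), and hasAudioTrack are illustrative stand-ins for UA-internal state:

```javascript
const HAVE_METADATA = 1; // same value as HTMLMediaElement.HAVE_METADATA

async function canAutoplay(el) {
  if (el.gestureActivated) return true;               // gesture activation
  if (el.muted || el.volume === 0) return true;       // muted playback
  if (!el.loading && el.readyState < HAVE_METADATA) { // not loading, no metadata
    throw new Error("InvalidStateError");
  }
  if (el.loading) {
    // Wait for the in-flight load to succeed or fail rather than
    // answering "ask me again later".
    const loadedMetadata = await el.loadFinished;
    if (!loadedMetadata) throw new Error("InvalidStateError");
  }
  if (!el.hasAudioTrack) return true;                 // no audio track
  return false;                                       // otherwise, not allowed
}
```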

Therefore developers do not need to track the loadedmetadata and error events, e.g.:

video.load();
video.canAutoplay()
  .then((canAutoplay) => {
    video.muted = !canAutoplay;
    return video.play();
  })
  .then(() => /* success */)
  .catch((e) => /* error */);

@mounirlamouri
Member

the element-level API should be used for a clear answer: "can I play now?". Because metadata are key in order to answer this question, we decided that we would require this state in order to answer the question instead of having a case where the UA answers "no, but ask me again later".

If we want to avoid having a case where the UA answers "no, but ask me again later", then we should not reject the promise with InvalidStateError if readyState has not yet reached HAVE_METADATA. That's equivalent to saying "try again later".

Not quite. The API would require metadata to be available. By never giving an answer unless the information is available, we are trying to make the API behaviour predictable.

So when canAutoplay() is called, and if we don't already know that playback will be allowed due to some other condition such as gesture activation, and readyState has not yet reached HAVE_METADATA, we should delay resolving the promise until metadata has been loaded.

As @rebeccahughes pointed out, this may end up with a hanging promise. The solution of waiting if metadata is being loaded is somewhere in the middle, but I think it would just make the API less predictable. Requiring metadata will reduce the chances of developers shooting themselves in the foot. It may end up with situations where a race condition sometimes allows the API to succeed and sometimes makes it reject, but that's not so different from sometimes saying yes and sometimes saying no, and it's much less likely. Again, the use case here is that a page will check whether it can play something without muting it, and if the API says no, it will add a muted attribute. In this context, waiting for metadata is not a problem. The assumption is that for any use case where different resources would be picked, the document-level API would be used.

@cpearce
Author

cpearce commented Nov 21, 2018

If we can answer before we have the metadata (e.g. gesture) we should resolve immediately.

Great!

However, I am concerned that not returning immediately if there is no data will result in hanging promises. Since the canAutoplay function does not actually change any state there is no guarantee that the promise will ever resolve.

What about this: when we come to the readyState check, we resolve the promise immediately unless the element is loading, in which case we delay resolving until the load either succeeds or fails.

Yes, this sounds like a good idea.

I agree we'd need to be careful to reject/resolve(false) in the cases where the load algorithm is running but has stalled. For example, if the resource selection algorithm is waiting on a child to be appended to the HTMLMediaElement. I don't know how MSE interacts with the load algorithm compared to the src=url case, but there may be a similar situation, such as a MediaSource being attached with no source buffers added yet.

@cpearce
Author

cpearce commented Nov 21, 2018

Not quite. The API would require metadata to be available. By never giving an answer unless the information is available, we are trying to make the API behaviour predictable.

The point about metadata being loaded is that it's racy; it depends on the network. This makes it unpredictable.

As @rebeccahughes pointed out, this may end up with a hanging promise.

We should reject in the cases where there would otherwise be a hanging promise; if the resource isn't loading, there's nothing to play.

The solution to wait if metadata are being loaded is somewhere in the middle but I think it would just make the API less predictable.

It seems to me that making canAutoplay() wait until metadata had loaded would make it more predictable, not less.

Without this provision, the following code would give you different results depending on the speed of the network/cache/media pipeline:

(async function () {
  let v = document.createElement("video");
  v.src = "video-without-audio-track.mp4"; // Note: starts the load algorithm
  let canPlay = false;
  while (!canPlay) {
    canPlay = await v.canAutoplay().then(() => true, () => false);
    console.log("canPlay=" + canPlay);
  }
})();

Whereas if we awaited loading metadata this would always give the same behaviour (unless there was a network error).

Requiring metadata will reduce the chances of developers shooting themselves in the foot.

I don't think this follows. If authors have to remember to await metadata loading before they can get an authoritative answer from canAutoplay(), then mistakes will be made and authors will forget. Doing this as part of the API itself makes it easier to use correctly, and thus more predictable.

Would you as an author ever want to intentionally call this API before the metadata had been loaded?

@ghost

ghost commented Nov 22, 2018

Would you as an author ever want to intentionally call this API before the metadata had been loaded?

I think the document-level autoplayPolicy should be used if an author wants an answer before the metadata is loaded, as that will return whether autoplay or muted autoplay is allowed. That gives a quick answer, and then canAutoplay can be used once the metadata has loaded and we have all the information to make a firm decision.

@alastor0325
Collaborator

This is an old issue, which was already resolved by the TAG resolution (the conclusion was to make the API sync), so I'm closing it.
