What to do with the *default* media handling 'magic' in mobile browsers? #41

Closed
richtr opened this issue May 20, 2015 · 10 comments

Comments

@richtr
Member

richtr commented May 20, 2015

From the spec:

When playing media on the web, developers are currently forced to adopt a single default platform modality for playing all media content. On the other hand, native applications can access much richer media integration options with an underlying platform. On mobile devices, native application developers can request many different forms of media integration with the platform to obtain access to headphone buttons, lock screens and notification areas as needed. On desktop devices, native applications have access to keyboard media key events. Native application developers can specify the conditions in which media content should pause or duck on audio interruptions (i.e. pause or lower the volume for the duration of an interruption), continue playing out when application focus is lost or the device screen is switched off and interface with internal and external remote controllers.

If media sessions allow web developers to opt in to custom platform-level media behavior on different platforms, why do we insist on enforcing strict, arbitrary platform-level integration when web media content has not opted in to that?

Currently on mobile devices, by default, <audio> will continue playing out when the web browser is backgrounded and/or the device's screen is switched off. It may provide notification area controls and allow users to play and pause the audio content from the notification area. Clicking on the notification may bring the user back to the browser and, ideally, bring the tab making the noise to the foreground. It may display audio metadata on the homescreen, obtained from either the <audio> element or document metadata (such as document title and favicon). It may allow only one media element to play out at a time, or it may mix multiple media elements to play out at the same time. It may automatically pause <video> when the browser is backgrounded. Or it may do only some of these things, or none of them, depending on which browser you try.

All of this behavior is a.) inconsistently provided across different web browsers, b.) completely magic in that it cannot be observed or controlled by web applications and c.) something web developers must opt out of (instead of opting in to it in the first place) through the introduction of media sessions.

In line with the principles of the extensible web manifesto we must try to explain or remove this auto-magic behavior for default media handling by specifying how 'default' media playback should perform consistently across different web browsers and devices.

So what should we do? Describe the current magic of default media handling somehow? Choose a single sensible, consistent modality for default media handling on mobile devices (e.g. by default let's choose to treat all media as 'ambient' content)? Or should we just leave the magic of default media handling alone and not try to explain it in programmatic terms and continue to leave this up to implementors to decide what platform-level inter-operation and integration to provide by default for media content?

@foolip
Member

foolip commented May 20, 2015

These issues have come to the forefront as a result of our trying to flesh out the programmatic API (the MediaSession interface) in the spec. Ideally the default behavior and everything one can do declaratively (e.g. kind="content") should correspond to some usage of the API.

The intended default behavior is for all media elements in a tab (top-level browsing context) to share a single media session of kind "content", so that typically only one tab can play at the same time, but elements within a tab don't interrupt each other.
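As a rough illustration of that intended default, here is a toy TypeScript model. It does not use the draft MediaSession API at all: `ToySession`, `ToyElement` and `ToyFocusManager` are hypothetical names invented for this sketch, and the interruption rule is a simplifying assumption (a "content" session that starts playing takes audio focus from any other "content" session).

```typescript
// Toy model only: every name here is hypothetical and NOT part of the MediaSession API.
type SessionKind = "content" | "ambient";

class ToySession {
  kind: SessionKind;
  elements: ToyElement[] = [];
  constructor(kind: SessionKind) {
    this.kind = kind;
  }
}

class ToyElement {
  name: string;
  session: ToySession;
  playing = false;
  constructor(name: string, session: ToySession) {
    this.name = name;
    this.session = session;
    session.elements.push(this);
  }
}

class ToyFocusManager {
  private current: ToySession | null = null;

  // Start playback. A "content" session takes focus from any other "content" session;
  // an "ambient" session never takes or loses focus in this model.
  play(element: ToyElement): void {
    const session = element.session;
    if (session.kind === "content" && this.current !== session) {
      if (this.current !== null) {
        this.interrupt(this.current);
      }
      this.current = session;
    }
    element.playing = true;
  }

  // Pause every element attached to the session that lost focus.
  private interrupt(session: ToySession): void {
    for (const el of session.elements) {
      el.playing = false;
    }
  }
}

// Assumed default: one "content" session per tab, shared by all elements in that tab.
const manager = new ToyFocusManager();
const tabA = new ToySession("content");
const tabB = new ToySession("content");
const songA = new ToyElement("song in tab A", tabA);
const effectA = new ToyElement("sound effect in tab A", tabA);
const videoB = new ToyElement("video in tab B", tabB);

manager.play(songA);
manager.play(effectA); // same session, so songA keeps playing
const bothPlayInTabA = songA.playing && effectA.playing; // true

manager.play(videoB); // different tab, so tab A's session is interrupted
const tabAInterrupted = !songA.playing && !effectA.playing && videoB.playing; // true
```

The model deliberately ignores origins and processes, which is exactly the part that makes the real default session hard to expose to scripts.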

The trouble is that this default media session might span origins (and processes with out-of-process iframes) so it's unlike media sessions created with new MediaSession(), and it can't be exposed to scripts without some mechanism to isolate origins.

Options considered:

  1. Let mediaElement.session==null represent the default case. This is strange, in that there is actually a media session, and the meaning of assigning mediaElement.session=null is actually to use that invisible session.
  2. Let mediaElement.session==null represent an ambient media session, since ambient is really the absence of any audio focus or key event handling.
    2a. Maintain the per-tab default, but use the same MediaSession objects for all media elements that can observe each other, so that it appears to Web content that all media elements in a tab share a session. All such default sessions within a tab would need a layer of brokering to determine which of these sessions gets to control the lock screen and gets media key events.
    2b. Change the default to a unique "content" session per element. Very simple API-wise, but would break content that deliberately plays multiple media elements together.
    2c. Change the default to ambient, i.e. no MediaSession object is created automatically. This means audio can never play in the background without explicitly using media sessions, and no notifications or lock screen UI would be shown unless sites opt in to it.

In short, we'd really like to describe the default behavior in terms of the API. Option 1 isn't a good fit, options 2b and 2c have a simple elegance, but option 2a also seems doable.
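To make the difference between options 2b and 2c concrete, here is a small TypeScript sketch. It is only a toy model under stated assumptions: `DefaultPolicy`, `Player`, `startPlayback` and `playAll` are hypothetical names, a `sessionId` of `null` stands in for "no MediaSession object", and a player with its own "content" session is assumed to interrupt every player that holds a different one.

```typescript
// Hypothetical model; none of these names come from the spec.
type DefaultPolicy = "2b-per-element-content" | "2c-ambient";

interface Player {
  name: string;
  playing: boolean;
  sessionId: number | null; // null models "no MediaSession object" (ambient)
}

// Build players whose default session depends on the chosen option.
function makePlayers(policy: DefaultPolicy, names: string[]): Player[] {
  return names.map((name, index) => ({
    name,
    playing: false,
    sessionId: policy === "2b-per-element-content" ? index : null,
  }));
}

// Start one player. If it has a session, players with a different session are interrupted.
function startPlayback(players: Player[], target: Player): void {
  if (target.sessionId !== null) {
    for (const other of players) {
      if (other.sessionId !== null && other.sessionId !== target.sessionId) {
        other.playing = false;
      }
    }
  }
  target.playing = true;
}

// Start every player in order and report which ones are still playing at the end.
function playAll(policy: DefaultPolicy, names: string[]): string[] {
  const players = makePlayers(policy, names);
  for (const player of players) {
    startPlayback(players, player);
  }
  return players.filter((p) => p.playing).map((p) => p.name);
}

const tracks = ["drums", "bass"];
const stillPlaying2b = playAll("2b-per-element-content", tracks); // ["bass"]
const stillPlaying2c = playAll("2c-ambient", tracks); // ["drums", "bass"]
```

Under the 2b assumption, a page that layers two elements loses the first one as soon as the second starts, which is the breakage mentioned above. Under 2c both keep playing, at the cost of no background playback or lock screen UI unless the site opts in.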

@foolip
Member

foolip commented May 21, 2015

A variation of option 2a I think is worth considering is to forget about the extra brokering layer, so that cross-origin iframes within a tab simply compete with each other for audio focus. In the case of YouTube or SoundCloud embedded iframes, this is the likely end result anyway, as such sites would need to create a MediaSession of kind "content" in order to customize the lock screen and notification UI.

@avayvod
Contributor

avayvod commented May 21, 2015

I think we can't have websites opt in to any default experience because of backwards compatibility - opt-out is the only way if we want to improve the experience for existing websites, since forcing web developers to use the API in any way would be hard.
Currently we don't specify the default behavior, leaving browsers free to interpret and implement it.

As I see it:

  • If the browsers have consistent behavior (without the spec), the problem described in #issue-78528763 doesn't exist.
  • If the browsers have different behavior but the web page is okay with default behavior by any browser, the problem doesn't exist either.
  • If the browsers have different behavior but the web page can specify the behavior it wants and it will work consistently in any browser supporting the API, the problem doesn't exist (the developer has means to fix it) - that's why we have to enforce behavior across browsers if we have a clear intention from web developers.
  • If the browsers have different behavior and the web page wants different behavior in different browsers and/or platforms so it has to specify different kind/session depending on the UA, we have a problem.
    However I would be interested in a real use case before taking any action.

From reading the extensible web manifesto and "Bedrock" and "Extend The Web Forward" (which I think talk more about magic than the link in the OP), I didn't get the impression that magic is the unspecified default behavior: magic is something that can't be overridden by lower level APIs (e.g. customize the border of with canvas drawing methods). The MediaSession API is exactly the lower level API that allows overriding the default behavior on demand, so the developer is in control, and I feel our proposal is well aligned with the principles of the Web.

I'm afraid specifying the "right" default behavior for all browsers to implement will likely get ignored or even make us lose support from other browser vendors. I can imagine even the same browser having slightly different default behavior on different platforms so specifying it doesn't seem feasible.

Feel free to help me understand the concerns better; I'm still pretty new to the Web Platform world and may have misread the links provided :)

Regarding expressing the default behavior via MediaSession API, isn't setting session.kind to "" equal to not specifying it at all and opting into the default behavior?

@foolip
Member

foolip commented May 21, 2015

IMHO, the main issue isn't standardizing the default behavior (that might never happen) but that certain (sensible) default behaviors like per-tab or per-browser media sessions cannot be replicated by using an API, because the behavior spans cross-origin frames or tabs.

It would be nice if the default behavior were simply a matter of connecting HTMLMediaElement and MediaSession objects such that the desired behavior emerges. If that's not possible, there's some behavior we deem useful (like media elements in cross-origin iframes sharing a session) that's impossible for Web developers to replicate and tweak.

The downside of exposing the default media session via an API is of course a risk that some code assumes e.g. a per-element session because a popular browser does that, and it breaks if the default session is per-document or something.

@annevk
Member

annevk commented Jul 2, 2015

How are nested browsing contexts a problem? Wouldn't we activate a session only in one?

(Some at Mozilla would like the default to be defined too.)

@foolip
Member

foolip commented Jul 2, 2015

What would you like the default to be? If the default media session is defined and that MediaSession object can be accessed by scripts, you're now limited to default behaviors that coordinate media elements within a single document or at least groups of documents that can manipulate each other.

@annevk
Member

annevk commented Jul 2, 2015

Paging @ehsan, @sicking, @bakulf.

@richtr richtr changed the title from "What to do with the default media handling 'magic' in mobile browsers?" to "What to do with the *default* media handling 'magic' in mobile browsers?" on Dec 31, 2015
@jernoble

I agree with @avayvod. There are valid reasons for not mandating a specific default behavior, not the least of which would be because even within WebKit, we would want different default behavior for mobile vs. desktop UAs.

@xxyzzzq
Contributor

xxyzzzq commented Sep 12, 2016

Closing this issue since we are moving audio focus out to a separate API.

@xxyzzzq
Contributor

xxyzzzq commented Oct 3, 2016

This issue was moved to WICG/audio-focus#24


6 participants