Stream Type - Proposal #3

Closed
cjpillsbury opened this issue Apr 19, 2022 · 27 comments · Fixed by #10

Comments

@cjpillsbury
Collaborator

cjpillsbury commented Apr 19, 2022

Overview & Purpose

The idea of different “stream types” has been around for a long time in various HTTP Adaptive Streaming (HAS) standards and their precursors in some manner, minimally distinguishing between “live” content and “video on demand” content. However, these categories aren’t consistently named or distinguished in the same way across the various specifications. Moreover, there is no corresponding API in the browser. Yet these categories directly inform how one expects users to consume and interact with the media, including what sort of UI or “chrome” should be made available for the user. By way of example, the built-in controls/UI in Safari that show up for a live src are different from those that show up for a VOD src. This proposal aims to normalize the names and definitions of StreamTypes (in a way that is extensible and evolvable over time) based on how they are expected to be consumed and interacted with by a viewer/user. It also provides a concise and easy-to-understand differentiator for anyone implementing different UIs/controls/"chromes" for the various stream types.

An additional goal of this proposal is to recommend that MSE-based players or “playback engines” normalize their use of existing APIs so that they are as consistent as possible with the proposed StreamType inference algorithm.

Proposed StreamType Types & Definitions

  • "unknown" (default) - There is no media content or there is currently insufficient information to determine the StreamType of the current media content (e.g. metadata or similar is still loading, async default StreamType inference not yet done)
  • "vod" (“Video on Demand”) - The media content has a known start and end time and is intended to be randomly seekable from start to end as long as the content is available at all
  • "live" - The media content is intended to be viewed at the “live edge” as forward/subsequent content is made available over time and is not intended to be seekable at all
  • "dvr" - The media content has a known start time and by default is intended to be viewed at the “live edge” as forward content is made available over time, but all backward/previous content is also available for seeking from start to the current “live edge”
  • (Future?) "sliding" (“Sliding Window”, “Partial DVR”) - The media content is by default intended to be viewed at the “live edge” as forward content is made available over time, but is also intended to be seekable within a (roughly) consistent time window relative to the current “live edge”
  • Others?

Proposed Interface

  • type StreamType = "unknown" | "vod" | "live" | "dvr" (| "sliding"?) (| string?)
  • HTMLMediaElement::get streamType() {} : StreamType
    • Will use Inferred stream type if no streamType is set. See below for algorithm
  • HTMLMediaElement::set streamType() {}
    • Intended to override inferred stream type
  • Event Types: streamtypechange
    • Should be fired whenever streamType changes (inferred or explicitly set)
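
For illustration, the getter/setter/event behaviour listed above could be sketched in TypeScript roughly as follows. This is only a sketch under assumptions: StreamTypeState, setInferred, and onStreamTypeChange are hypothetical names standing in for an extended media element and its streamtypechange event; none of them exist in any current API.

```typescript
type StreamType = "unknown" | "vod" | "live" | "dvr";

type StreamTypeListener = (streamType: StreamType) => void;

// Hypothetical stand-in for an extended media element; not a real DOM class.
class StreamTypeState {
  private inferredValue: StreamType = "unknown";
  private explicitValue: StreamType | undefined = undefined;
  private listeners: StreamTypeListener[] = [];

  // Getter: an explicitly set value wins; otherwise use the inferred one.
  get streamType(): StreamType {
    return this.explicitValue !== undefined ? this.explicitValue : this.inferredValue;
  }

  // Setter: overrides whatever was inferred.
  set streamType(value: StreamType) {
    const previous = this.streamType;
    this.explicitValue = value;
    this.notifyIfChanged(previous);
  }

  // Called by the (hypothetical) inference logic when it reaches a result.
  setInferred(value: StreamType): void {
    const previous = this.streamType;
    this.inferredValue = value;
    this.notifyIfChanged(previous);
  }

  // Stand-in for addEventListener("streamtypechange", ...).
  onStreamTypeChange(listener: StreamTypeListener): void {
    this.listeners.push(listener);
  }

  // Notify only when the effective value actually changed.
  private notifyIfChanged(previous: StreamType): void {
    const current = this.streamType;
    if (current !== previous) {
      for (const listener of this.listeners) listener(current);
    }
  }
}
```

In this sketch an explicit value keeps winning over later inferred results, which is one possible reading of "intended to override inferred stream type".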

Proposed Stream Type Inferring (overridable)

Algorithm (Pseudo-code):

  1. Let StreamType = "unknown"
  2. If Number.isNaN(mediaEl.duration) (exit)
    • Aka StreamType = "unknown"
  3. If mediaEl.duration !== Infinity, StreamType = "vod" (exit)
    • Stricter: If mediaEl.seekable.end(0) === mediaEl.duration (or Math.abs(mediaEl.duration - mediaEl.seekable.end(0)) <= MOE for precision considerations)
  4. If mediaEl.duration === Infinity
    • Stricter: If mediaEl.seekable.end(0) < mediaEl.duration (or (mediaEl.duration - mediaEl.seekable.end(0)) > MOE for precision considerations)
    1. Let ChunkDuration = the presumed longest duration, in seconds, of a media chunk/segment
    2. Let SeekableStart0 = mediaEl.seekable.start(0)
    3. Let SeekableEnd0 = mediaEl.seekable.end(0)
    4. Wait ChunkDuration
    5. Let SeekableEnd1 = mediaEl.seekable.end(0)
    6. Let SeekableStart1 = mediaEl.seekable.start(0)
    7. If SeekableEnd1 === SeekableEnd0 (or Math.abs(SeekableEnd1 - SeekableEnd0) <= MOE for precision considerations), GOTO sub-step 4 (“Wait ChunkDuration”)
    8. If SeekableStart1 === SeekableStart0 (or Math.abs(SeekableStart1 - SeekableStart0) <= MOE for precision considerations), StreamType = "dvr" (exit)
    9. If SeekableStart1 > SeekableStart0 (or (SeekableStart1 - SeekableStart0) > MOE for precision considerations), StreamType = "live" (exit)
      • NOTE: This doesn’t account for/differentiate “sliding” StreamType
  5. (exit)
    • Aka StreamType = "unknown"
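
A rough TypeScript sketch of the steps above, written as a pure function over two seekable readings taken ChunkDuration apart so that it can be exercised without a real media element. The names SeekableSample and inferStreamType and the MOE value of 0.1 seconds are assumptions made for the example; the proposal does not define them.

```typescript
type StreamType = "unknown" | "vod" | "live" | "dvr";

// One reading of mediaEl.seekable.start(0) / mediaEl.seekable.end(0).
interface SeekableSample {
  start: number;
  end: number;
}

// Margin of error (MOE) in seconds; 0.1 is an assumed placeholder value.
const MOE = 0.1;

// Returns the inferred type, or null when the seekable end has not advanced
// yet and the caller should wait another ChunkDuration and sample again.
function inferStreamType(
  duration: number,
  before: SeekableSample,
  after: SeekableSample,
  moe: number = MOE
): StreamType | null {
  // Step 2: no duration yet. (isNaN is used because NaN === NaN is false.)
  if (isNaN(duration)) return "unknown";
  // Step 3: any duration other than Infinity is treated as on-demand content.
  if (duration !== Infinity) return "vod";
  // Step 4.7: the live edge did not move between samples; sample again later.
  if (Math.abs(after.end - before.end) <= moe) return null;
  const startShift = after.start - before.start;
  // Step 4.8: the start stayed put while the end advanced.
  if (Math.abs(startShift) <= moe) return "dvr";
  // Step 4.9: the start moved forward along with the end.
  if (startShift > moe) return "live";
  // Step 5: anything else stays "unknown".
  return "unknown";
}
```

A caller would take one sample, wait ChunkDuration, take a second sample, and repeat the wait whenever the function returns null. Like the pseudo-code, this does not distinguish a "sliding" window from "live".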

Additional Considerations

  • At the very beginning of a live stream, the algorithm above may misidentify or fail to disambiguate between "dvr" and "live"
    • This algorithm should be re-applied/computed whenever the dependent variables may change
  • At the very end of a "live"/"dvr" stream, the computed stream type could change to "vod" based on the currently proposed algorithm.

Related Standards/Specs Definitions

Distinguishing/Categorizing Types

RFC 8216 (“HLS”)

ISO/IEC 23009-1 (“MPEG-DASH”)

  • Explicit - static (“vod”), dynamic (“dvr” or “live” - cannot differentiate by attr)
    • Defined by MPD@type attribute value (§5.3.1.2, Table 3 — Semantics of MPD element)
  • Implicit - “dvr”
    • MPD@timeShiftBufferDepth (§5.3.1.2, Table 3 — Semantics of MPD element) grows consistently with the available Segments & wall clock time and has a consistent computed start time (similar to inferred algorithm for "dvr")

Duration for "live"/"dvr"

Seekable Range for "dvr"

@gkatsev
Member

gkatsev commented Apr 19, 2022

I've always had a hard time trying to talk about DVR vs sliding window DVR and what not. So, having an agreed set of names will definitely simplify things.

Some questions/comments:
What does MOE stand for in the algorithm?

Why not account for the HLS and DASH properties in the algorithm? Could add something like the following as step 2:

If the media provides a stream type [see HLS, DASH], set streamType to the provided value. End algorithm.

Is the main difference between live and sliding window DVR the threshold at which you should start showing the DVR-like controls? Maybe it doesn't need an official stream type, as different users likely have a different tolerance for the threshold at which to bring up the controls.

@cjpillsbury
Collaborator Author

cjpillsbury commented Apr 27, 2022

What does MOE stand for in the algorithm?

"margin of error". It would be a constant. I can update the algorithm to define the term and loosely define the value.

Why not account for the HLS and DASH properties in the algorithm?

I was hoping to avoid scope creep since how these values are provided and what information you have available will vary. E.g. For HLS, if you're using "native browser" playback, the algorithm is effectively the same since you don't have direct access to the playlists. For MPEG-DASH, there is only "static" (-> "vod") vs. "dynamic" (-> "live" | "dvr" | "sliding").

That said, it may be worth at least having some discussion on how these would be inferred from the manifests/playlists and how they change over time?

Is the main difference between live and sliding window DVR the threshold at which you should start showing the DVR-like controls?

It could also impact the "type" of UI you'd want to present, specifically around seeking. "sliding" is somewhere in between a "live" experience and a "dvr" experience, since the start time is "moving under foot" so designers may want to account for that. Theoretically, they could both fall into the category of "DVR" (which is one reason I kept it as a potential "future" type), but most designs that conflate the two are particularly bad for "sliding" (bracketing the clunkiness of most "dvr" designs that lack a known/estimated duration).

I think for now we can likely pretend the distinction doesn't exist and treat it out of scope, but with the current direction of this proposal, thinking about either an extensible/customizable set of possible stream types or, at the very least, a set that can change over time would be good when trying to think through risky assumptions in v1.

@heff
Member

heff commented Nov 18, 2022

Could this also be represented as:

StreamType:

  • null (duration = NaN)
  • live (duration = Infinity)
  • vod (duration = other number)

DVRWindow:

  • NaN/Undefined (default, no window, previously "live")
  • Infinity (previously "dvr")
  • Some other number ("sliding")

I think it would be easier on the UI if you can build a general Live UI, without having to know all the live types.
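
As a sketch of how a UI might branch on these two values, assuming dvrWindow is always a number (NaN when there is no window). SeekUi, chooseSeekUi, and the returned labels are made-up names for the example, not part of the proposal.

```typescript
type StreamType = "live" | "vod" | null;

// Hypothetical labels for the kinds of seek UI a player might render.
type SeekUi =
  | "none"
  | "full-seek-bar"
  | "live-only"
  | "live-full-dvr"
  | "live-sliding-window";

function chooseSeekUi(streamType: StreamType, dvrWindow: number): SeekUi {
  // null: nothing is known yet (duration = NaN).
  if (streamType === null) return "none";
  // vod: regular seek bar from start to duration.
  if (streamType === "vod") return "full-seek-bar";
  // From here on the stream is live; only dvrWindow refines the UI.
  if (isNaN(dvrWindow)) return "live-only";
  if (dvrWindow === Infinity) return "live-full-dvr";
  return "live-sliding-window";
}
```

A general live UI only needs the streamType check; the dvrWindow branches are optional refinements.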

@cjpillsbury
Collaborator Author

I think this looks great as an alternative. Let's assume we'll move forward with this. A few callouts on the details:

  • Let's use "on-demand" instead of "vod" for the value (tl;dr - make it less "jargony" and don't implicitly assume video)
  • For the values, let's assume the "nil" cases are either
    • strict and always undefined (either the explicit value undefined or literally not defined)
    • loose and defined as "anything that doesn't fall into the previous categories" (e.g. !("live" || "on-demand") for StreamType, not positive number for DVRWindow)
    • loose and defined as "anything nil-like" (NaN || null || undefined)
  • Property names are dvrWindow and streamType on a(n extended) media element.
    • Not sure these should be part of the media-ui-extensions formalization, but attributes could be either
      • stream-type & dvr-window (more consistent with media chrome & Open Elements naming conventions; higher legibility)
      • streamtype & dvrwindow (more consistent with generic HTML attr naming conventions; lower legibility)
  • Properties are (can be?) inferred based on the media content but are overridable via a setter (aka not read only)

@cjpillsbury
Collaborator Author

Just to call this out explicitly (discussed out of band):

  1. With this new proposal, it's possible to have e.g. streamType="on-demand" && dvrWindow=30, which is "technically invalid".
  2. Since this is for Media UI Extensions, this may actually be a strength, allowing folks to treat on demand content "as if" it's e.g. a sliding window.

@gkatsev
Member

gkatsev commented Nov 18, 2022

I do like having the two properties, since DVR window is a description of live to me.
DVR Window could also be defined as only valid if the stream type is live, but might make sense to keep it loose and allow for vod to be treated as live/dvr depending on what's set.

Might make sense to have a "minimal" support and "maximal" support.

@gkatsev
Member

gkatsev commented Nov 18, 2022

Thinking about it some more, I think that having the dvr window require a number is a problem. Specifically, what if someone wants to play a sliding window DVR but doesn't know what the window is because they offloaded all the video stuff to some service?
I think there should be a way to say "I want this to have the DVR UI, but figure out the window from the content yourself"

@cjpillsbury
Collaborator Author

Thinking about it some more, I think that having the dvr window require a number is a problem. Specifically, what if someone wants to play a sliding window DVR but doesn't know what the window is because they offloaded all the video stuff to some service?
I think there should be a way to say "I want this to have the DVR UI, but figure out the window from the content yourself"

Would this be a problem if they relied on the "inferred" value use case described above?

@gkatsev
Member

gkatsev commented Nov 18, 2022

I guess for me, it's the expectation of the type of UI that is being shown based on these configurations.
For stream-type="on-demand" it should show the regular UI we're used to with a start time and duration.
For stream-type="live", by default, it should show a simpler UI without a progress bar or other timings. But, I'd like to configure it with DVR, matching the HLS event type, where the UI looks most like an on-demand stream, where there is a progress bar and the times start at 0. In addition, I'd like a second DVR UI for a sliding window which shows like the last 30 seconds or 2 hours or whatever the stream is configured as, however, I don't want to know what the stream is set to.

@heff
Member

heff commented Nov 19, 2022

Let's use "on-demand"

👍

let's assume the "nil" cases are either...strict and always

I like strict, and it should probably never be undefined unless actually not implemented. e.g. duration = NaN, srcObject=null. Then we can detect when this isn't implemented.
streamType: null
dvrWindow: NaN (assuming it's always a Number otherwise)

Properties are (can be?) inferred based on the media content but are overridable via a setter (aka not read only)

That feels like it could get complicated. At the UI layer (media-chrome) you could certainly decide to ignore the media element values, but setting the stream type on the media element itself is like saying "you think you're playing vod, but you're really playing live". It's open to interpretation how the media element should handle that.

Not sure these should be part of the media-ui-extensions formalization, but attributes could be either

Yeah, probably not part of media-ui-extensions since media elements don't push state out to attributes.

@gkatsev I'm following everything except "I want this to have the DVR UI, but figure out the window from the content yourself"... "however, I don't want to know what the stream is set to".

Are we talking about:

  • Wanting to configure the window manually
  • Not having access to what window is available in the manifest
  • Something else?

How would one "figure out the window from the content" if the media element isn't reporting that detail through a property like dvrWindow?

Finally, alt proposal for dvrWindow is liveWindow, for similar reasons to vod/on-demand.

@gkatsev
Member

gkatsev commented Nov 19, 2022

I think I may have complicated things by not being extra clear in my thoughts, and also maybe not verifying the specific constraints on this proposal.
Basically, there are two issues at hand:

  1. as a player developer, I want to know what the stream type is to display things accordingly.
  2. As a player user, I want a specific UI for the media I'm putting on my site.

For 1, the stream type and live window stuff can generally be figured out from the underlying video data, like duration being Infinity means live and the live window is seekableEnd-seekableStart.
For 2, we want to be able to provide this data from the outside. What I meant by "however, I don't want to know what the stream is set to" is that a player user may not know how a particular live stream is configured in terms of number of segments and segment durations and just wants to be able to configure the player to show a particular UI. Mux Player is such an example, because you can set stream-type and get the corresponding UI, regardless of what the video actually is.

Hopefully, that clarifies things.

@heff
Member

heff commented Nov 28, 2022

@gkatsev yep, thanks

like duration being Infinity means live and the live window is seekableEnd-seekableStart

Do we need this new API then? Media chrome, Mux Player, and other players can of course add some sugar to make working with different stream types easier, but for the sake of media elements specifically, do we already have what we need to determine stream type and the dvr window? Is seekable missing anything?

@gkatsev
Member

gkatsev commented Nov 28, 2022

So, would every component need to check if duration is Infinity and what the seekable is before doing anything?

Also, with hls.js for example, the seekable end is slightly different from hls.js's liveSyncPosition (I'm not exactly sure why, but that's another matter). This could get pushed down into the slotted media element implementation.

Actually, this brings up the question: is this a property that's supposed to be exposed from the media element?

Maybe the solution to my dichotomy is that media-chrome should use the media element's provided stream-type unless media-controller was given a stream-type via an attr? Separating the two this way also makes it so that there isn't a concern about setting the property from inside, while also making it be settable from the outside.
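
Roughly, that precedence could look like the sketch below. resolveStreamType and its parameter names are hypothetical, and it assumes the attribute arrives as a plain string (or null when absent).

```typescript
type StreamType = "live" | "on-demand";

// controllerAttr: value of a stream-type attribute on the controller, or null.
// mediaStreamType: whatever the media element itself reports, if anything.
function resolveStreamType(
  controllerAttr: string | null,
  mediaStreamType: StreamType | null | undefined
): StreamType | null {
  // An explicit, recognised attribute from the outside always wins.
  if (controllerAttr === "live" || controllerAttr === "on-demand") {
    return controllerAttr;
  }
  // Otherwise fall back to the media element; null means "not known".
  return mediaStreamType == null ? null : mediaStreamType;
}
```

With this split the media element's own property is never written from the outside, while the UI can still be forced into a particular mode.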

@heff
Member

heff commented Nov 28, 2022

is this a property that's supposed to be exposed from the media element?

Yes. For context, this whole repo is about "Extending the HTMLVideoElement API". Any conversations about media chrome or how a player would use the API should only be to inform the media element API design. i.e. this isn't the forum to solve media chrome specific things, and if we're headed that route we should push it over to a media chrome issue.

would every component need to check if duration is Infinity and what the seekable is before doing anything

In the media chrome case, no. Only media-controller would check the media element's properties, and then it would translate it into stream type, etc for other components. I feel like that's fine. It's a whole other thing to say every [slotted] media element has to do that translation work and expose a new API for the result.

Also, with hls.js for example, the seekable end is slightly different from hls.js's liveSyncPosition (I'm not exactly sure why, but that's another matter). This could get pushed down into the slotted media element implementation.

Yeah, that's interesting. I think we'd expect custom media elements to make their own seekable property match what's intended to be seekable for the dvr window, meaning not just pass through the native video element's seekable data if it's not quite right. Then it'd be good to understand if the native video element needs to be fixed somehow, per browser, to support live windows better.

@cjpillsbury
Collaborator Author

Since this is intended for media-ui-extensions, I'm hesitant to conflate data APIs with UI, as these can come apart (e.g. there may be a need for a programmatic seekable that is distinct from liveWindow, given the way MediaSource or other non-src values can work). Similarly, even if we don't want this to be a part of the media-ui-extensions, I suspect we'll want/need setters for these values. For example, there is no guaranteed, in-spec inferable way with MPEG-DASH to distinguish between a small seekable window in live ("dynamic") content to avoid stalls/account for latency vs. "DVR"/"sliding window". Having these values be settable allows a developer to announce how they want the UI to be presented:

  • in advance of the asynchronous process of loading/parsing/etc. the media content
  • to disambiguate in cases where the media content can't sufficiently do so
  • to override a default "presentation" of the UI based on the developer's desired effect (e.g. "make the player look live, even if the content is on demand")

@cjpillsbury
Collaborator Author

Just to wrap this up, I'm going to pin down what we have so far:

Stream Types

Interface

  • Property Name: streamType
  • Valid Values: "live" | "on-demand" | null | undefined
  • Event Type: streamtypechange, detail = streamType

Inferred

  • "live" - HTMLMediaElement::duration === Number.POSITIVE_INFINITY
  • "on-demand" - Number.isFinite(HTMLMediaElement::duration)
  • Change Condition: HTMLMediaElement durationchange Event
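
A minimal sketch of the inferred mapping, assuming null is used while the value is not yet known. streamTypeFromDuration is a hypothetical helper name; in a real element it would be re-run on each durationchange event.

```typescript
type StreamType = "live" | "on-demand";

function streamTypeFromDuration(duration: number): StreamType | null {
  // Infinite duration: the content is still growing, so treat it as live.
  if (duration === Number.POSITIVE_INFINITY) return "live";
  // Any finite duration: on-demand. (isFinite also rejects NaN.)
  if (isFinite(duration)) return "on-demand";
  // NaN (nothing loaded yet) or anything else: not known.
  return null;
}
```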

HAS-Specific: HLS

(assumes a media playlist for the current src has been loaded at least once)

  • "live" - !#EXT-X-PLAYLIST-TYPE || #EXT-X-PLAYLIST-TYPE:EVENT
  • "on-demand" - #EXT-X-PLAYLIST-TYPE:VOD
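
A sketch of that mapping applied to the raw text of a media playlist. streamTypeFromMediaPlaylist is a hypothetical helper; it only looks at the EXT-X-PLAYLIST-TYPE tag, exactly as in the two bullets above, and ignores everything else in the playlist.

```typescript
type StreamType = "live" | "on-demand";

function streamTypeFromMediaPlaylist(playlistText: string): StreamType {
  // Find an EXT-X-PLAYLIST-TYPE tag on its own line (multiline regex).
  const match = /^#EXT-X-PLAYLIST-TYPE:(EVENT|VOD)\s*$/m.exec(playlistText);
  // Only an explicit VOD type maps to "on-demand".
  if (match !== null && match[1] === "VOD") return "on-demand";
  // A missing tag or EVENT both map to "live" under this proposal.
  return "live";
}
```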

HAS-Specific: MPEG-DASH

(assumes the manifest MPD for the current src has been loaded at least once)

  • "live" - MPD@type="dynamic"
  • "on-demand" - !MPD@type || MPD@type="static"

DVR

DVR will be modeled separately from streamType as a boolean.

Interface

  • Property Name: dvr
  • Valid Values: true | false | null | undefined
  • Event Type: dvrchange, detail = dvr

Inferred DVR

TBD

Out of scope proposal:
HTMLMediaElement::seekable.end(0) - HTMLMediaElement::seekable.start(0) >= DVR_WINDOW_SIZE
where DVR_WINDOW_SIZE is some determined duration threshold sufficiently large to count as DVR (or "sliding window") and may potentially be configurable via a property or attribute
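
As a sketch only, the threshold check could look like the following. The 60-second default is an arbitrary assumption for the example, and inferDvr is a hypothetical helper name.

```typescript
// Assumed threshold in seconds; the proposal leaves the actual value open.
const DVR_WINDOW_SIZE = 60;

// seekableStart / seekableEnd correspond to seekable.start(0) / seekable.end(0).
function inferDvr(
  seekableStart: number,
  seekableEnd: number,
  dvrWindowSize: number = DVR_WINDOW_SIZE
): boolean {
  // DVR when the seekable window is at least as long as the threshold.
  return seekableEnd - seekableStart >= dvrWindowSize;
}
```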

HAS-Specific: HLS

(assumes a media playlist for the current src has been loaded at least once)

  • true - #EXT-X-PLAYLIST-TYPE:EVENT
  • false - !#EXT-X-PLAYLIST-TYPE:EVENT

HAS-Specific: MPEG-DASH

TBD

Out of scope proposal:
MPD@timeShiftBufferDepth > DVR_WINDOW_SIZE
where DVR_WINDOW_SIZE is some determined duration threshold sufficiently large to count as DVR (or "sliding window") and may potentially be configurable via a property or attribute

@gkatsev
Member

gkatsev commented Dec 7, 2022

Valid Values: "live" | "on-demand" | null | undefined

why have both null and undefined here?

Inferred DVR

TBD

Wouldn't this be that seekable.start(0) doesn't change?

@heff
Member

heff commented Dec 7, 2022

why have both null and undefined here?

undefined would essentially mean "unimplemented". That will probably be true for any new media-ui-extension. null means implemented but unknown.

@gkatsev
Member

gkatsev commented Dec 7, 2022

Would it be better to have "unknown" be the unknown value rather than null to keep the type the same except for when there's no support for the feature? i.e., it'll make the type be string | undefined.
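
To make the difference concrete, here is a sketch of how a consumer might check for support under that shape. MediaLike, hasStreamTypeSupport, and shouldShowLiveUi are hypothetical names for the example.

```typescript
type StreamType = "unknown" | "live" | "on-demand";

// Structural stand-in for a media element that may or may not implement
// the extension; undefined means "not implemented".
interface MediaLike {
  streamType?: StreamType;
}

function hasStreamTypeSupport(media: MediaLike): boolean {
  return media.streamType !== undefined;
}

function shouldShowLiveUi(media: MediaLike): boolean {
  // "unknown" and undefined both fall through to the non-live UI here.
  return media.streamType === "live";
}
```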

@heff
Member

heff commented Dec 7, 2022

It's worth noting that you can just rely on duration for stream type, but there's value in a specific streamType property, because of the async nature of duration. The player might know the stream type before duration is set, even from other metadata about the video.

Would it be better to have "unknown" be the unknown value rather than null

I can see that making sense and it's more direct. But null also matches "no attribute". All the aria props (strings) default to null. If it was a string "unknown" there might be temptation to sprout that value to an attribute? Although...this is probably property only, not an attribute, since it's not user configurable.

@heff
Member

heff commented Dec 7, 2022

DVR

@cjpillsbury thanks for writing this up. I'm still not totally clear on the reasons why this is needed from a media element in addition to seekable, and how it would be used. Is it basically "this media is meant to be accessible as DVR (have a progress bar), no matter what the seekable range might be right now"?

@cjpillsbury
Collaborator Author

Is it basically "this media is meant to be accessible as DVR (have a progress bar), no matter what the seekable range might be right now"

Yeah, it solves that problem and makes it easier to definitively disambiguate between live (which always has some seekable range) and dvr for e.g. hls.

@cjpillsbury
Collaborator Author

@gkatsev @heff I propose we treat "sliding window" and its relationship to dvr as out of scope for this discussion.

@cjpillsbury
Collaborator Author

@gkatsev @heff since there's still (out of band) discussion about "DVR" more generally, I propose we also treat DVR as out of scope for this discussion. I've written up a google doc discussing some of the complexities and considerations around DVR, available for comment here. I'll go ahead and start a separate github discussion specifically for DVR with a link to the google doc.

@cjpillsbury
Collaborator Author

Assuming we descope all DVR from the Stream Types discussion, I believe we are close to finalizing this proposal for "live" and "on-demand". The only potential disagreement that remains is how we should represent the "unknown" case where streamType is supported by the media element. The two proposals here are:

  1. null
  2. "unknown"

I have a slight leaning toward (2) to make it explicit, though I'm amenable to either. @gkatsev @heff I will defer to whatever you two think is the better value.

Once this is decided, let me know if there is anything else outstanding to finalize this proposal. Otherwise, let's make this decision and finish our first media-ui-extensions proposal 🎉.

@gkatsev
Member

gkatsev commented Dec 8, 2022

I would lean towards 2 "unknown" as well. Prior art:

@heff
Member

heff commented Dec 9, 2022

I'm good with unknown. In media-chrome I don't think we should follow that pattern for the attribute, and we'll stick with "no media-stream-type attribute" (getAttribute returns null) meaning unknown or unimplemented by the media. But that's a different problem space. If there's disagreement with that we can follow up in the media chrome thread.
