Basic catalog and track selection. #63

kixelated · 2023-01-09T21:01:28Z

No longer trying to boil the ocean. An endpoint advertises broadcasts and tracks via the CATALOG message and the other endpoint chooses the tracks via the SUBSCRIBE message.

TODO:

* Default tracks (avoid a round-trip).
* Debate merging broadcast ID and URL.
* Broadcast/track updates and termination.
* Broadcast/track metadata.
* Sender-side ABR.
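
As a rough illustration of the flow described above, here is a minimal sketch of one endpoint advertising tracks in a CATALOG and the other answering with a SUBSCRIBE. The field names and the `description` label are assumptions made for the example; the draft defines the actual wire format.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    track_id: int
    description: str  # hypothetical label, standing in for real track metadata


@dataclass
class Catalog:
    broadcast_id: int
    tracks: list = field(default_factory=list)


@dataclass
class Subscribe:
    broadcast_id: int
    track_ids: list


def choose_tracks(catalog: Catalog, wanted: set) -> Subscribe:
    """Build a SUBSCRIBE that selects the advertised tracks the receiver wants."""
    ids = [t.track_id for t in catalog.tracks if t.description in wanted]
    return Subscribe(catalog.broadcast_id, ids)


catalog = Catalog(7, [Track(1, "audio"), Track(2, "video-720"), Track(3, "video-1080")])
sub = choose_tracks(catalog, {"audio", "video-720"})
print(sub)
```

The receiver cannot build the SUBSCRIBE until the CATALOG arrives, which is the round trip the "default tracks" TODO item aims to avoid.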

afrind

It's straightforward, which I like. Can the sender of a CATALOG also unilaterally begin sending OBJECT messages on tracks of its choice? How does the receiver stop those, or stop other track subscriptions it is no longer interested in?

draft-lcurley-warp.md

afrind · 2023-01-09T22:25:13Z

draft-lcurley-warp.md

-An integer indicating the delivery order ({{delivery-order}}).
-This field is optional and the default value is 0.
+* Broadcast URL:
+An optional, globally unique identifier for the broadcast.


Do we need to say more about the format or reference another RFC?

No idea. I'll look for examples.

I'm just going to leave this barren for now and only reference the URI spec. I'm sure somebody will care about semantics and expand.

afrind · 2023-01-09T22:28:16Z

draft-lcurley-warp.md

@@ -427,17 +424,21 @@ Warp Message {

 The Message Length field contains the length of the Message Payload field in bytes.


Not for this PR but we should describe what to do when the message length doesn't match the actual length (eg: too many track IDs in a SUBSCRIBE)

Oh wow I didn't even see this. Yeah we shouldn't have this length because it's 1. redundant and 2. we don't always know the full size for each OBJECT.

Plus I'm less of a fan of the specifications that always have a box length (ex. TLS, MP4). It's absolutely more difficult to encode and I'd rather go the QUIC style where versions/extensions are explicitly negotiated.

Where an explicit message length can help is for writing parsers that aren't smart enough to parse a partial message. E.g., you buffer enough to read the type and length, then buffer up the message payload, then parse. That makes more sense for smaller messages or endpoints that will need the entire message anyway. For OBJECT it doesn't make sense, I agree.

You can remove the redundancy by removing the number of track IDs from subscribe, the parser just parses track IDs until it hits the message length (though there's still a case where the varint length exceeds the remaining message bytes).
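
A minimal sketch of that idea, assuming track IDs are QUIC variable-length integers and that `payload` is exactly the bytes covered by the message length. The function names are made up for the example. It also shows the leftover error case, where a varint claims more bytes than remain in the message.

```python
def read_varint(buf: bytes, pos: int):
    """Decode one QUIC varint starting at pos; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)  # top two bits select 1, 2, 4 or 8 bytes
    if pos + size > len(buf):
        raise ValueError("varint exceeds remaining message bytes")
    value = buf[pos] & 0x3F
    for b in buf[pos + 1 : pos + size]:
        value = (value << 8) | b
    return value, pos + size


def parse_track_ids(payload: bytes) -> list:
    """Read track IDs until the message payload is exhausted (no explicit count)."""
    ids = []
    pos = 0
    while pos < len(payload):
        track_id, pos = read_varint(payload, pos)
        ids.append(track_id)
    return ids


print(parse_track_ids(bytes([0x01, 0x40, 0x50, 0x02])))
```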

Yeah, it makes decoding much easier, since you can skip unknown message type or extensions.

But it makes encoding harder because you either* have to allocate a new chunk of memory for each message, or you have to go back and populate the size afterwards. The latter isn't really an option when you use varints since the number of bytes for the size is variable.

* you can pre-compute the size before encoding but it's also gross

Yeah, it makes decoding much easier, since you can skip unknown message type or extensions.

Oh I forgot about this. It's a killer feature we probably don't want to lose, except probably for OBJECT. Once you send an OBJECT message, the length is "til the end of the stream".

it makes encoding harder

There are a few options. You can serialize full-size (8 byte) varints every time. You can also reserve the max varint size, serialize the frame, then prepend the varint and set the start-of-message offset accordingly. Another trick we use is vectored writes, where we serialize the size and the payload in different buffers which are chained together.
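
The first two options can be sketched as follows. This assumes QUIC-style varints (which permit non-minimal encodings, so an 8-byte length is always valid); `encode_varint` and `frame_in_place` are hypothetical helper names, not part of any draft or library.

```python
from typing import Optional

_PREFIX = {1: 0x00, 2: 0x40, 4: 0x80, 8: 0xC0}


def encode_varint(value: int, size: Optional[int] = None) -> bytes:
    """Encode a QUIC varint, minimally by default or padded to a fixed size."""
    if size is None:
        size = next((s for s in (1, 2, 4, 8) if value < (1 << (8 * s - 2))), 8)
    if size not in _PREFIX or not 0 <= value < (1 << (8 * size - 2)):
        raise ValueError("value does not fit in the requested varint size")
    raw = bytearray(value.to_bytes(size, "big"))
    raw[0] |= _PREFIX[size]
    return bytes(raw)


def frame_in_place(parts: list) -> bytes:
    """Reserve 8 bytes, serialize the payload once, then back-fill the length."""
    buf = bytearray(8)  # room for the largest possible varint
    for part in parts:
        buf += part
    length = encode_varint(len(buf) - 8)
    start = 8 - len(length)  # the message begins where the length begins
    buf[start:8] = length
    return bytes(buf[start:])


print(encode_varint(3, 8).hex())             # option 1: always 8 bytes
print(frame_in_place([b"ab", b"c"]).hex())   # option 2: reserve, then prepend
```

Option 1 costs up to 7 extra bytes per message; option 2 keeps the minimal encoding at the price of tracking a start offset.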

I'm thinking we separate control streams from data streams. That seems to be the general consensus, but it's not written in the draft. Maybe control streams could have a message length while data streams do not.

But in general I want to copy QUIC/HTTP3. I really like the encoding and I generally assume the Media over QUIC working group does too. That means no message length for now but we should absolutely tackle the versioning and backwards compatibility issue.

I think that's a really good idea - have a stream type that identifies it as a control stream or object stream. Control streams are framed, and all messages have lengths. Unknown messages are skipped/ignored. Object streams could be entirely unframed, or could have some frames up front and transition to unframed when the object starts.
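
A minimal sketch of such a framed control stream, where every message is `type, length, payload` and unknown types are skipped. The message type values in `KNOWN` are placeholders chosen for the example, not values from the draft.

```python
def read_varint(buf: bytes, pos: int):
    """Decode one QUIC varint starting at pos; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)
    if pos + size > len(buf):
        raise ValueError("truncated varint")
    value = buf[pos] & 0x3F
    for b in buf[pos + 1 : pos + size]:
        value = (value << 8) | b
    return value, pos + size


# Hypothetical type values, for illustration only.
KNOWN = {0x02: "CATALOG", 0x03: "SUBSCRIBE"}


def parse_control_stream(data: bytes) -> list:
    """Return (name, payload) for known messages; skip anything unrecognized."""
    messages = []
    pos = 0
    while pos < len(data):
        msg_type, pos = read_varint(data, pos)
        length, pos = read_varint(data, pos)
        if pos + length > len(data):
            raise ValueError("message length exceeds buffered data")
        payload = data[pos : pos + length]
        pos += length  # the length lets us step over payloads we cannot parse
        name = KNOWN.get(msg_type)
        if name is not None:
            messages.append((name, payload))
    return messages


stream = bytes([0x02, 0x01, 0xAA, 0x3F, 0x02, 0x00, 0x00, 0x03, 0x00])
print(parse_control_stream(stream))
```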

kixelated · 2023-01-09T22:53:41Z

Can the sender of a CATALOG also unilaterally begin sending OBJECT messages on tracks of its choice? How does the receiver stop those, or stop other track subscriptions it is no longer interested in?

Not yet. I decided to punt on that since it is an optimization with no clear approach. It could be as simple as adding a "default" flag to each track in the CATALOG, but it should be a separate PR.

kixelated · 2023-01-10T00:18:09Z

draft-lcurley-warp.md

 OBJECT Message {
+  Broadcast ID (i),


@wilaw From our discussion, one of your performance goals is to support OBJECT passthrough within a CDN. The idea was to use the broadcast URL if we know that it's globally unique.

But what if you had an entry point into the video system that sanitizes the broadcast ID, making it unique within the video system? We could specify that broadcast ID SHOULD be random, and in the rare case of a collision, the ingest server could either have to rewrite each OBJECT or maybe ask that the client use a different broadcast ID?

Once that's been done, then the OBJECT can be passed through the rest of the video system without modification.

The bonus upside is that each viewer could then have a different broadcast URL while sharing the same OBJECT messages. This would be very useful for server-side ad insertion, for example, where user A and user B are watching two similar but distinct broadcasts. And of course it avoids sending the broadcast URL for each OBJECT, which will definitely add up when delivering small content like audio frames or datagrams.

But what if you had an entry point into the video system that sanitizes the broadcast ID, making it unique within the video system

That assumes the hypothetical entry point has visibility across the entire system. Uniqueness has to be assured not only within the receiving relay but all the way through to the final edge node. That isn't practical in a multi-vendor system, for example, two CDNs stacked on top of one another with the first acting as an origin shield. It also isn't practical in a single-vendor globally distributed system without some centralized registry of active broadcast IDs, which introduces its own problems of time-to-resolution and consistency. I suspect systems would resort to a hierarchical naming scheme to ensure uniqueness. Throw in a vendor ID on top of that and now you are back to a GUID.

in the rare case of a collision, the ingest server could either have to rewrite each OBJECT or maybe ask that the client use a different broadcast ID?
We really don't want relays to have to rewrite objects - that will lower their throughput. Asking the publisher to use a different broadcast ID would be preferred, but then you are back to the problems described above in ensuring uniqueness.

Stepping back, a globally unique identifier in each object simplifies object routing, caching and forwarding across heterogeneous systems. Its primary penalty is size. I would contend that with GOP-per-object and frame-per-object at video bitrates this size is workable. The worst case (and perhaps the design case) is low-bitrate audio streams and datagrams, in which the object size will be on the order of 1500 bytes, in which case the 128-bit (16-byte) ID represents a ~1% reduction in the payload. So you need additional datagrams to transmit each sample/frame. Not ideal, but still workable according to Cisco, as proponents of datagram-per-object, with whom I have discussed this. I have separate concerns, however, about routing these small objects at scale. It would be good to do some testing and establish performance benchmarks on a prototype.

The bonus upside is that each viewer could then have a different broadcast URL while sharing the same OBJECT messages.

I think this same decoupling can be achieved by having a catalog with multiple subscription options and allowing the subscriptions to be absolute paths rather than only relative ones. In the example below are two catalogs. Each refers to the same primary sports streams, however both inject a custom subscription for ad content.

    Catalog customcatalogserver.com/streamid12345/unique-userID1111
    Subscription #0: yourverybestadcontent.com/customadcontent/1234
    Subscription #1: primarycontent.com/sports/football/game23/1080
    Subscription #1: primarycontent.com/sports/football/game23/720

    Catalog customcatalogserver.com/streamid12345/unique-userID2222
    Subscription #0: yourverybestadcontent.com/customadcontent/5678
    Subscription #1: primarycontent.com/sports/football/game23/1080
    Subscription #1: primarycontent.com/sports/football/game23/720

This would be efficient for the relay as it would retrieve and store the primary content only once.

I think ads may be better served as VOD assets downloaded via subscription ahead of an insertion point, with the insertion point triggered in-stream, but that's probably commentary for a different PR.

How about we merge ID and URL? I think they're both the same thing conceptually, except a varint has limited space so any encoding hierarchy is far more primitive. A variable-length ID is definitely required and I'm fine using a formatted string (URL).

But what if you had an entry point into the video system that sanitizes the broadcast ID, making it unique within the video system

That assumes the hypothetical entry point has visibility across the entire system. Uniqueness has to be assured not only within the receiving relay but all the way through to the final edge node. That isn't practical in a multi-vendor system, for example, two CDNs stacked on top of one another with the first acting as an origin shield. It also isn't practical in a single-vendor globally distributed system without some centralized registry of active broadcast IDs, which introduces its own problems of time-to-resolution and consistency. I suspect systems would resort to a hierarchical naming scheme to ensure uniqueness. Throw in a vendor ID on top of that and now you are back to a GUID.

You always need some authority to ensure that an ID is unique. For example, if I push moq://kixelated.io/12345 to two different hosts, one of them is going to have to say "no".

A global database is the easiest way of doing this but like you mentioned, it doesn't scale. Generally you shard the ID into multiple authorities. The client doesn't know about this sharding scheme, so it needs to be told which ID to use by some authority. That could be built into the protocol, similar to how connection IDs work in QUIC: the client chooses one at random and then could get told by the server to use a different ID.

But things get hairy when multiple vendors are involved because now there are multiple authorities. A broadcast ID chosen for the Twitch system won't be applicable for the Akamai system unless there is a standardized scheme. That's possible, but I think not allowing vendors to rewrite the ID would be a constraint, not a benefit.

All I'm trying to say is that relay accepting connections from 3rd parties should be prepared to verify the broadcast ID/URL. If there's a collision, the relay either needs to rewrite the ID or reject the broadcast outright.
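
A minimal sketch of that relay-side check, assuming the relay keeps an in-memory map of active broadcast IDs. The class and method names are hypothetical, and the random suffix is just one possible way to rewrite a colliding ID.

```python
import secrets
from typing import Optional


class BroadcastRegistry:
    """Tracks which session owns each broadcast ID on a single relay."""

    def __init__(self):
        self._owners = {}

    def announce(self, broadcast_id: str, session: str,
                 allow_rewrite: bool = False) -> Optional[str]:
        """Return the ID the publisher should use, or None if rejected."""
        owner = self._owners.get(broadcast_id)
        if owner is None or owner == session:
            self._owners[broadcast_id] = session
            return broadcast_id
        if not allow_rewrite:
            return None  # collision: reject the broadcast outright
        while True:
            # Collision: pick a fresh ID, much like a QUIC server
            # asking the client to use a different connection ID.
            candidate = f"{broadcast_id}-{secrets.token_hex(4)}"
            if candidate not in self._owners:
                self._owners[candidate] = session
                return candidate


registry = BroadcastRegistry()
first = registry.announce("moq://kixelated.io/12345", "session-A")
rejected = registry.announce("moq://kixelated.io/12345", "session-B")
rewritten = registry.announce("moq://kixelated.io/12345", "session-B", allow_rewrite=True)
print(first, rejected, rewritten)
```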

in the rare case of a collision, the ingest server could either have to rewrite each OBJECT or maybe ask that the client use a different broadcast ID?
We really don't want relays to have to rewrite objects - that will lower their throughput. Asking the publisher to use a different broadcast ID would be preferred, but then you are back to the problems described above in ensuring uniqueness.

Stepping back, a globally unique identifier in each object simplifies object routing, caching and forwarding across heterogeneous systems. Its primary penalty is size. I would contend that with GOP-per-object and frame-per-object at video bitrates this size is workable. The worst case (and perhaps the design case) is low-bitrate audio streams and datagrams, in which the object size will be on the order of 1500 bytes, in which case the 128-bit (16-byte) ID represents a ~1% reduction in the payload. So you need additional datagrams to transmit each sample/frame. Not ideal, but still workable according to Cisco, as proponents of datagram-per-object, with whom I have discussed this. I have separate concerns, however, about routing these small objects at scale. It would be good to do some testing and establish performance benchmarks on a prototype.

Yeah, I think the play is to let the ID be variable length like a QUIC connection ID. When OBS connects to Twitch it could use a fully qualified URL (~50 bytes), but internally we could rewrite it to some much smaller encoding if the overhead is a problem. In our system, I imagine the $ cost of transit would be greater than CPU cost of rewriting the ID at origin.
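
For a feel of the numbers, here is a small back-of-the-envelope calculation of the share of each object taken up by the ID. The 1-byte, 16-byte and 50-byte ID sizes correspond to a short varint, a 128-bit ID and a fully qualified URL; the 160-byte object is an assumed small audio frame, not a figure from this thread.

```python
def id_overhead(id_bytes: int, object_bytes: int) -> float:
    """Fraction of an object of object_bytes that is spent on the ID."""
    return id_bytes / object_bytes


for object_bytes in (1500, 160):  # 160 bytes is a hypothetical small audio frame
    for id_bytes in (1, 16, 50):
        share = id_overhead(id_bytes, object_bytes)
        print(f"{id_bytes:>2}-byte ID in {object_bytes:>4}-byte object: {share:.1%}")
```

The cost is modest at MTU-sized objects and grows quickly as objects shrink, which is why the ability to rewrite to a shorter internal encoding matters mostly for small audio frames.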

At the end of the day, the ID only needs to be unique within the session. It's a bonus if there's additional information encoded in it, but that requires some pre-negotiated scheme. Maybe to stream to Twitch/Akamai a broadcaster would have to use a vendor-specific scheme. But without a standardized scheme, a generic client like OBS should be able to use any broadcast ID.

The bonus upside is that each viewer could then have a different broadcast URL while sharing the same OBJECT messages.

I think this same decoupling can be achieved by having a catalog with multiple subscription options and allowing the subscriptions to be absolute paths rather than only relative ones. In the example below are two catalogs. Each refers to the same primary sports streams, however both inject a custom subscription for ad content.

    Catalog customcatalogserver.com/streamid12345/unique-userID1111
    Subscription #0: yourverybestadcontent.com/customadcontent/1234
    Subscription #1: primarycontent.com/sports/football/game23/1080
    Subscription #1: primarycontent.com/sports/football/game23/720

    Catalog customcatalogserver.com/streamid12345/unique-userID2222
    Subscription #0: yourverybestadcontent.com/customadcontent/5678
    Subscription #1: primarycontent.com/sports/football/game23/1080
    Subscription #1: primarycontent.com/sports/football/game23/720

This would be efficient for the relay as it would retrieve and store the primary content only once.

I think ads may be better served as VOD assets downloaded via subscription ahead of an insertion point, with the insertion point triggered in-stream, but that's probably commentary for a different PR.

Seems fine. That would mean giving each track a URL and removing the broadcast ID/URL from the OBJECT message. Or adding another layer to the hierarchy.

Hmm, I think a URI might be a middle ground. The schema would declare how it's structured, and you could have a schema that is more restrictive?

You always need some authority to ensure that an ID is unique. For example, if I push moq://kixelated.io/12345 to two different hosts, one of them is going to have to say "no".

We have a global system for validating uniqueness - DNS + TLS certs. Perhaps we can leverage that for MOQ. What if we define the moq broadcast URI to serve double-duty as the HTTPS verification API for permission to publish? As an example, if the client contacts any ingest server and asks to publish https://kixelated.io/12345, the ingest server first makes an HTTPS request to https://kixelated.io/12345. The request is probably accompanied by an access token given to it out-of-band to protect that API endpoint. The ingest server expects to get back some confirmation response, which we would define, that authorizes it to accept the publication. There could be other factors besides uniqueness that kixelated may wish to enforce (geo, time-of-day restrictions, etc.). This saves the distribution system from having to maintain global uniqueness, by delegating all streams purporting to come from the kixelated domain to kixelated. If the ingest server is only authorized to accept streams for rainbows.com, then it could reject the publish request immediately. This could work in reverse too - the publishing app must first communicate with kixelated.com to get assigned a unique ID along with an access token, which it then hands to the ingest server when requesting publication permission.
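
The "reject immediately" step can be sketched without any network traffic: the ingest server compares the authority of the requested broadcast URI against the domains it is configured to serve. The `may_accept` name and the set of authorized hosts are assumptions for the example; the follow-up HTTPS verification request is deliberately left out because its format is not defined anywhere yet.

```python
from urllib.parse import urlsplit


def may_accept(broadcast_uri: str, authorized_hosts: set) -> bool:
    """True if the URI's host is one this ingest server is allowed to serve."""
    host = urlsplit(broadcast_uri).hostname  # lowercased by urlsplit, or None
    return host is not None and host in authorized_hosts


authorized = {"rainbows.com"}
print(may_accept("https://rainbows.com/show", authorized))
print(may_accept("https://kixelated.io/12345", authorized))
```

Passing this check only establishes which authority to ask; it says nothing about whether the publisher is well-behaved.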

That's a possibility, although a certificate doesn't prevent duplicates. And maybe the user is authenticated and has a valid certificate from rainbows.com, but that doesn't mean they are well-behaved (ex. use the same broadcast/cert twice). If I'm running a service like Twitch that accepts broadcasts from users, then I need to assume they will be malicious.

A relay can only forward if it trusts the source. I think there are cases when you can do that (ex. superbowl) and public key crypto is an option. But it can't be a requirement, because I absolutely do not trust any of my users.

kixelated · 2023-01-10T00:51:42Z

draft-lcurley-warp.md

+
+~~~
+SUBSCRIBE Message {
+  Broadcast ID (i),


@suhasHere as I understand, QUICR gives each object what is effectively a path so you can do wildcard subscriptions. So you could do stuff like:

    SUBSCRIBE broadcastA/track*
    OBJECT broadcastA/track1/frame1
    OBJECT broadcastA/track1/frame2
    OBJECT broadcastA/track2/frame1
    OBJECT broadcastA/track2/frame2

One thing I realized after talking to @wilaw is that subscribing to multiple broadcasts is probably out of scope. For example:

    SUBSCRIBE broad*
    OBJECT broadcastA/track1/frame1
    OBJECT broadcastB/track1/frame1
    OBJECT broadcastC/track1/frame1
    <cdn implodes>

It miiight be a valid use-case but it seems questionable. I think a subscription can be safely scoped to a single broadcast. This seems required to subscribe based on media timestamp or something else broadcast specific. It doesn't matter if it's Broadcast ID or Broadcast URL; either works so long as they match what's in the CATALOG.

Okay so hypothetically we've got the broadcast ID as a separate field:

    SUBSCRIBE broadcastA, track*
    OBJECT broadcastA, track1/frame1
    OBJECT broadcastA, track1/frame2
    OBJECT broadcastA, track2/frame1
    OBJECT broadcastA, track2/frame2

But it doesn't actually make sense to use wildcards for the frame number. It's ambiguous, or doesn't give you enough control. For example:

    SUBSCRIBE broadcastA, track1/frame2*
    OBJECT broadcastA, track1/frame2
    OBJECT broadcastA, track1/frame20
    OBJECT broadcastA, track1/frame21

It's safe to say that you want something more expressive, probably related to timestamps. Wildcards miiight work with enough masks but probably not.

Okay so let's split the object ID into its own field. It's "object ID" in this draft but it could reasonably be "sequence" or "timestamp".

    SUBSCRIBE broadcastA, track1, frame > 23
    OBJECT broadcastA, track1, frame24
    OBJECT broadcastA, track1, frame25

Now we're left with track IDs as their own field and not part of a larger path. We certainly need the ability to select individual tracks (no wildcard) and I'd like to start with this functionality. We can always explore how to more efficiently subscribe to multiple tracks if enumerating them is too costly.
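
To make the ambiguity concrete, here is a small sketch comparing a path wildcard against a structured filter. It uses shell-style matching from Python's `fnmatch` as a stand-in for whatever wildcard syntax a path-based scheme would define, and the frame numbers are made up.

```python
from fnmatch import fnmatchcase

# (broadcast, track, frame number) for a handful of hypothetical objects.
objects = [("broadcastA", "track1", n) for n in (2, 3, 19, 20, 21, 24, 25)]


def path(obj) -> str:
    broadcast, track, frame = obj
    return f"{broadcast}/{track}/frame{frame}"


# Path wildcard: "frame2*" matches by prefix, so it picks up frame2 and
# every frame in the twenties, which is rarely what the subscriber means.
wild = [path(o) for o in objects if fnmatchcase(path(o), "broadcastA/track1/frame2*")]

# Structured filter: broadcast and track are exact, the object ID is numeric.
structured = [
    frame for broadcast, track, frame in objects
    if broadcast == "broadcastA" and track == "track1" and frame > 23
]

print(wild)
print(structured)
```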

One thing I realized after talking to @wilaw is that subscribing to multiple broadcasts is probably out of scope.

Actually it should work. Assuming there is no semantic meaning in the store around the '/' boundary, it's not harder to do a match against

SUBSCRIBE broad*

than it is against

SUBSCRIBE broadcastA/*

Conceivably you could do

SUBSCRIBE *

and get everything.

What we were talking about was slightly different and related to cache entry stores and serving archival content in sequence. An HTTP cache data structure is highly optimized to return a quick answer to "is the object described by this key available?". As a consequence it is expensive to search across a cache and say "find me the last keyframe in this sequence of objects". The application was a player wanting to join a subscription from the prior GOP boundary. I think relay caches should be structured differently from HTTP caches. They could hold some form of linked list, as invariably we want to retrieve objects in a sequence. We should also store a simple flag with each object, indicating whether it is a "starting" object or not. We could then retrieve "broadcastA/track1/* starting from the last flagged keyframe". The relay would serve everything in its cache matching that request and then all future objects satisfying that filter.
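
A minimal sketch of such a per-track cache, assuming objects arrive in order. A plain Python list stands in for the linked list, and the class and field names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    sequence: int
    starting: bool  # True if a decoder can join the track at this object
    payload: bytes


class TrackCache:
    """Ordered cache for one track that remembers the latest starting object."""

    def __init__(self):
        self._objects = []
        self._last_start = None  # index of the most recent starting object

    def append(self, obj: CachedObject) -> None:
        if obj.starting:
            self._last_start = len(self._objects)
        self._objects.append(obj)

    def from_last_start(self) -> list:
        """Everything cached from the last flagged keyframe onward."""
        if self._last_start is None:
            return []
        return self._objects[self._last_start:]


cache = TrackCache()
for seq, starting in [(1, True), (2, False), (3, True), (4, False)]:
    cache.append(CachedObject(seq, starting, b""))
print([o.sequence for o in cache.from_last_start()])
```

Because the index is updated on insert, answering "where is the last keyframe?" is a constant-time lookup rather than a search across cache keys.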

One thing I realized after talking to @wilaw is that subscribing to multiple broadcasts is probably out of scope.

Actually it should work. Assuming there is no semantic meaning in the store around the '/' boundary, it's not harder to do a match against

SUBSCRIBE broad*

than it is against

SUBSCRIBE broadcastA/*

Conceivably you could do

SUBSCRIBE *

and get everything.

I envision that relays will conditionally subscribe to broadcasts. We've got thousands of edge nodes that do NOT want a copy of every broadcast. It's too much data for any individual host so we coalesce traffic based on the broadcast. A relay/edge would SUBSCRIBE only to specific broadcasts based on demand.

It might be possible to create a sharded URL scheme so a relay/edge would only need to SUBSCRIBE once on startup using a wildcard but I highly doubt it. There's just too many variables involved even for the most expressive rule engine.

SUBSCRIBE * seems fine for a video system that only serves extremely popular broadcasts, like the superbowl or something, where the data needs to get replicated to every host. But it's barely an optimization compared to SUBSCRIBE superbowl.

I think subscribing to all tracks within a broadcast has more merit. It effectively pre-warms the cache, since there's a high likelihood that someone watching a specific broadcast will change tracks. I understand the hesitation to group objects based on broadcast, but even with a generic path you're going to do that anyway (hence the wildcard).

What we were talking about was slightly different and related to cache entry stores and serving archival content in sequence. An HTTP cache data structure is highly optimized to return a quick answer to "is the object described by this key available?". As a consequence it is expensive to search across a cache and say "find me the last keyframe in this sequence of objects". The application was a player wanting to join a subscription from the prior GOP boundary. I think relay caches should be structured differently from HTTP caches. They could hold some form of linked list, as invariably we want to retrieve objects in a sequence. We should also store a simple flag with each object, indicating whether it is a "starting" object or not. We could then retrieve "broadcastA/track1/* starting from the last flagged keyframe". The relay would serve everything in its cache matching that request and then all future objects satisfying that filter.

I alluded to this earlier, but "give me the most recent key frame" is probably not good enough. A higher latency system like Twitch will want "give me the last 4 seconds of media" which might be multiple GoPs (the client doesn't know). There's also a desire to support DVR, which means "give me media starting at 1:23".

That's a compelling reason to group by broadcast, as all tracks within a broadcast are timestamp aligned. The SUBSCRIBE message could then contain a presentation timestamp, potentially relative to the start/end of the broadcast.
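
A rough sketch of a timestamp-based start point, assuming the relay keeps each track as a list of `(presentation timestamp in seconds, is keyframe)` pairs sorted by timestamp. The function name and the sample data (one object per second, 2-second GoPs) are made up for the example.

```python
import bisect
from typing import Optional


def start_index(objects: list, target_pts: float) -> Optional[int]:
    """Index of the latest keyframe at or before target_pts, or None."""
    pts = [p for p, _ in objects]
    i = bisect.bisect_right(pts, target_pts) - 1  # last object at or before target
    while i >= 0 and not objects[i][1]:
        i -= 1  # walk back to a decodable starting point
    return i if i >= 0 else None


# Ten seconds of media with a keyframe every 2 seconds.
track = [(float(t), t % 2 == 0) for t in range(10)]
latest = track[-1][0]

print(start_index(track, latest - 4))  # "give me the last 4 seconds of media"
print(start_index(track, 1.5))         # DVR-style "start at this timestamp"
```

In the first call the result covers the keyframes at 4, 6 and 8 seconds, so the reply spans several GoPs without the client needing to know the GoP size.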

vasilvv

Overall LGTM.

vasilvv · 2023-01-13T16:09:41Z

draft-lcurley-warp.md

@@ -172,7 +181,7 @@ A Warp broadcast is globally identifiable via a URI. Within the broadcast, every

 Depending on the profile of the application using it, Warp supports both a mode of operation where the peer unilaterally sends a broadcast with the media tracks of its choice, and a mode where the peer has to explicitly subscribe to a broadcast and select media tracks it wishes to receive.

-As an example, consider a scenario where `example.org` hosts a simple live stream that anyone can subscribe to. That live stream would be a single Warp broadcast identified by the URL `https://example.org/livestream`. In the simplest implementation, it would provide only two media tracks, one with audio and one with video. In more complicated scenarios, it could provide multiple video formats of different levels of video quality; those tracks would be variants of each other. Note that the track IDs are opaque on the Warp level; if the player has not received the description of media tracks out of band in advance, it would have to request the broadcast description first.
+As an example, consider a scenario where `example.org` hosts a simple live stream that anyone can subscribe to. That live stream would be a single Warp broadcast identified by the URI `warp://example.org/livestream`. In the simplest implementation, it would provide only two media tracks, one with audio and one with video. In more complicated scenarios, it could provide multiple video formats of different levels of video quality; those tracks would be variants of each other. Note that the track IDs are opaque on the Warp level; if the player has not received the description of media tracks out of band in advance, it would have to request the broadcast description first.


Is there a reason we introduce a new URI scheme? If we want those to be valid WebTransport URIs, those should be https.

I'll revert it. I don't really care about URI semantics but somebody might.

vasilvv · 2023-01-13T16:24:07Z

draft-lcurley-warp.md

+
+
+Only the most recent SUBSCRIBE message for a broadcast is active.
+SUBSCRIBE messages MUST be sent on the same QUIC stream to preserve ordering.


Actually, what happens when we receive multiple CATALOG messages?

Undefined for now. CATALOG messages contain init segments, and you can't change those without potentially breaking the decoder. We need a separate PR on how to add/remove/update tracks.

"be sent on the same QUIC stream" the same as ...? SETUP?

I'll make a PR to address the stream ambiguity. There has to be at least one control stream per broadcast. If there's only a single control stream per broadcast (probably?), then this text can be removed

kpugin · 2023-01-13T21:32:53Z

draft-lcurley-warp.md


 ## Streams
 Warp endpoints communicate over QUIC streams. Every stream is a sequence of messages, framed as described in {{messages}}.

-The first stream opened is a client-initiated bidirectional stream where the peers exchange SETUP messages ({{setup}}). The subsequent streams MAY be either unidirectional and bidirectional. For exchanging media, an application would typically send a unidirectional stream containing a single OBJECT message ({{object}}).
+The first stream opened is a client-initiated bidirectional stream where the peers exchange SETUP messages ({{message-setup}}). The subsequent streams MAY be either unidirectional and bidirectional. For exchanging media, an application would typically send a unidirectional stream containing a single OBJECT message ({{message-object}}).


Does this mean that the client (player) cannot get a video frame in 1 RTT? (Sorry, I might be a bit late, and this pull request may not be the right place to ask this.)

The WebTransport CONNECT request requires 1 RTT already. But the client is allowed to write streams while the CONNECT request is in flight, and the server is allowed to write streams while the CONNECT response is in flight.

So with a good WebTransport implementation, this SETUP doesn't incur an extra round trip. If it was a server-initiated bidirectional stream then it would. Maybe we can explicitly put the SETUP message in the WebTransport CONNECT or something to avoid this being a possibility though.

Although to be clear, this PR does introduce an extra round trip because the receiver must explicitly send SUBSCRIBE after receiving a CATALOG message. I would like to improve that though, like by advertising the default subscription in the CATALOG, but there were enough options that I'm saving it for another PR.

^ both a broadcaster and viewer would incur an extra RTT as a result.

kpugin · 2023-01-13T21:37:34Z

draft-lcurley-warp.md


 ## Prioritization
 Warp utilizes stream prioritization to deliver the most important content during congestion.

-The encoder may assign a numeric delivery order to each stream ({{delivery-order}})
+The encoder may assign a numeric delivery order to each object ({{delivery-order}})


nit: "encoder" might be too limiting here - there are likely cases where a CDN node might decide to splice something into the bitstream - it's not always the encoder

kpugin · 2023-01-13T21:49:53Z

draft-lcurley-warp.md

+|-----:|:-------------------|
+| 0x0  | Session Terminated |
+|------|--------------------|
+| 0x1  | GOAWAY             |


From our experience, it's very useful to have more than an "error code" - can we extend the error to be an error code plus some text blob?

For example, there could be multiple reasons why a session was terminated - it could have been terminated by the broadcaster, or it could have been terminated by the vendor, for example because of a copyright match.

Yeah, I think WebTransport supports an int code and string reason.

We only need to standardize the int code though.
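
A minimal sketch of that split, with the standardized integer codes taken from the table in this diff and a free-form reason string alongside. The `CloseInfo` type and the example reasons are assumptions, not part of the draft.

```python
from dataclasses import dataclass
from enum import IntEnum


class WarpError(IntEnum):
    """Codes from the table in this PR; only these would be standardized."""
    SESSION_TERMINATED = 0x0
    GOAWAY = 0x1


@dataclass(frozen=True)
class CloseInfo:
    code: WarpError
    reason: str = ""  # free-form text for humans and logs, not standardized


by_broadcaster = CloseInfo(WarpError.SESSION_TERMINATED, "terminated by broadcaster")
by_vendor = CloseInfo(WarpError.SESSION_TERMINATED, "terminated by vendor: copyright match")
print(int(by_broadcaster.code), by_broadcaster.reason)
print(int(by_vendor.code), by_vendor.reason)
```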

kpugin · 2023-01-13T21:52:34Z

draft-lcurley-warp.md


-Here, Track ID is the unique ID that identifies the track within the broadcast. Track Format idetifies the format used by the track; for the ISOBMFF-based format described in this document, 0x00 is specified. In the case 0x00 is used, Format-Specific Metadata contains a `ftyp` box followed by a `moov` box.
+* GOAWAY:
+The endpoint successfully drained the session after a GOAWAY was initiated ({{message-goaway}}).


@afrind hm... our semantics of GOAWAY is not session termination, right? It's an instruction to the endpoint to migrate to a new connection because the current one will be drained

This isn't a GOAWAY message, rather it's an error code to be passed to the WebTransport close message (which is like QUIC CONNECTION_CLOSE). In that case, I'm not sure of the value of having a GOAWAY code here; either the session drained or it didn't, right?

The idea was to handle the case when a broadcast terminates in the middle of a GOAWAY. "I migrated" versus "I didn't need to migrate".

Do you think I should remove GOAWAY and make a separate PR? I kind of snuck it in as we were reaching the draft deadline but it's absolutely out of place.

kpugin · 2023-01-13T21:56:09Z

draft-lcurley-warp.md

@@ -419,25 +423,26 @@ Both unidirectional and bidirectional Warp streams are sequences of length-delim
 ~~~
 Warp Message {
  Message Type (i),
-  Message Length (i),


What happened to length? How do relays or endpoints know message boundaries?

+1 I think we need to leave this in. In the short term, just set it to 0 for OBJECT and remove in another PR?

OBJECT messages can't have a length when they contain multiple frames. We could add a special case that 0 means until the end of the stream, but I think we default to QUIC encoding for now.

If relays need to proxy arbitrary messages, then yeah we should absolutely add a length. Same goes for backwards compatibility.

But yeah okay I'll make another PR. Posted my comment at the same time as Alan.

kpugin · 2023-01-13T21:58:13Z

draft-lcurley-warp.md

+| 0x10 | GOAWAY ({{message-goaway}})       |
+|------|-----------------------------------|
+
+## SETUP {#message-setup}

 The `SETUP` message is the first message that is exchanged by the client and the server; it allows the peers to establish the mutually supported version and agree on the initial configuration. It is a sequence of key-value pairs called *SETUP parameters*; the semantics and the format of individual parameter values MAY depend on what party is sending it.


I think we need a 1-RTT option: having endpoints exchange information before the actual media bytes start flowing will hurt performance and is a regression from DASH playback or RUSH ingest.

We could use some default values

kpugin · 2023-01-13T22:09:25Z

draft-lcurley-warp.md

 OBJECT Message {
+  Broadcast URI (b)


Hm... this will be a pretty big overhead. For example, if we want to have 1-frame CMAF audio chunks, that URI alone would be larger than the payload :/

I agree but this is the compromise. The ask is that relays can blindly copy OBJECT messages rather than re-encode them. I strongly want to avoid any requirement that this URI is globally unique though.

I would conceptually think of this like a QUIC connection ID. An endpoint can use as many bytes as it wants, and potentially proxy the connection ID rather than terminate. OBS could push a single broadcast named moq://live or something.

But yeah, let's pick a direction and we can revisit it as we go.

We could go the "optional compression" route: if both client and server agree, they can stick the URL in a table and transmit an ID here in its place. Relays that prefer larger network overhead over maintaining lookup tables could simply not support such compression.
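A rough sketch of what such an agreed table could look like follows. Everything here is an assumption for illustration: the `BroadcastTable` class, its methods, and the ID assignment scheme are hypothetical, and the thread does not define how the two endpoints would negotiate or synchronize the table.

```python
class BroadcastTable:
    """Maps broadcast URIs to small integer IDs (hypothetical helper)."""

    def __init__(self):
        self._ids = {}   # URI -> ID
        self._uris = []  # ID -> URI (the ID is the list index)

    def intern(self, uri):
        """Return the ID for a URI, assigning the next free ID on first use."""
        if uri not in self._ids:
            self._ids[uri] = len(self._uris)
            self._uris.append(uri)
        return self._ids[uri]

    def resolve(self, broadcast_id):
        """Return the URI for an ID, or raise KeyError if it was never assigned."""
        if not 0 <= broadcast_id < len(self._uris):
            raise KeyError(f"unknown broadcast ID {broadcast_id}")
        return self._uris[broadcast_id]


table = BroadcastTable()
uri = "moq://live"
broadcast_id = table.intern(uri)
# Each OBJECT would then carry a small integer in place of the full URI bytes.
saved_bytes_per_object = len(uri.encode()) - 1
```

The cost is the one raised above: every relay on the path has to hold the table and rewrite the field, so it can no longer copy OBJECT messages blindly.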

kpugin · 2023-01-13T22:17:49Z

draft-lcurley-warp.md

 }
 ~~~
 {: #warp-object-format title="Warp OBJECT Message"}

-This document defines the following headers:
+* Broadcast URI:
+The broadcast URI as declared in CATALOG ({{message-catalog}}).


what is the role of this URI?

It's to identify that an OBJECT is part of a broadcast. The track ID alone is not enough since it's not unique across broadcasts.

We could make the track unique, for example track URI, and then we would no longer need the broadcast URI. However since the track URI would likely just be a superset of the broadcast URI, there's really no point.

kpugin · 2023-01-13T22:23:37Z

draft-lcurley-warp.md

+
+~~~
+CATALOG Message {
+  Broadcast URI (b),


I am not sure why we need this. I would imagine the endpoint obtains the URI somehow: for example, a broadcasting client would obtain it from the Twitch/Meta/VendorX API, use it to open a session, go through setup within that session, and then issue a subscribe. Why do we need to carry the URI everywhere? The only thing we need is to multiplex multiple broadcasts on the same WebTransport connection, and that only needs a broadcast ID. Why do we need the whole URI?

I agree with @kpugin, this is how I imagine this working (pretty sure I'm missing something):

Ingest

1. The client/encoder gets a broadcastURI (or broadcastID) out of band.
2. The client/encoder initiates a WebTransport session to the server, adding that broadcastURI/broadcastID in the SETUP message.
3. The server responds authorized or error.
4. If authorized, the client/encoder (perhaps sends a SUBSCRIBE announcing the tracks) and starts sending media (or sends SUBSCRIBE + media in parallel while waiting for OK/ERR).

Note: I can NOT picture how the CATALOG process works in the ingest case.

Delivery

1. The client/player gets a broadcastURI (or broadcastID) out of band (like HLS/DASH with a manifest URL).
2. The client initiates a WebTransport session to the server, adding that broadcastURI/broadcastID in the SETUP message. We can also send a CATALOG (or "SUBSCRIBE to defaults") without waiting for the setup response to avoid one more RTT; it will only be processed if SETUP is successful.
3. The server responds authorized or error.
4. If authorized, the server starts sending media (in case "SUBSCRIBE to defaults" was in the same stream).
5. (If no "SUBSCRIBE to defaults") The client sends CATALOG for that broadcastURI (or broadcastID).
6. The server responds with a CATALOG response.
7. The client SUBSCRIBEs to the desired variants.
8. The server starts sending media.

Not including HTTP/3, here's the diagram thus far:

Ingest

--> client CONNECT url=auth
--> client SETUP role=producer
<-- server CONNECT status=200
<-- server SETUP role=consumer
--> client CATALOG broadcast=superbowl track=a track=b track=c
<-- server SUBSCRIBE broadcast=superbowl track=b
--> client OBJECT broadcast=superbowl track=b

I'm not sure if the client has to wait until the receipt of SETUP before it can send the CATALOG. If so, 2.5 RTTs to transfer the first frame.

Distribution

--> client CONNECT url=auth
--> client SETUP role=consumer
<-- server CONNECT status=200
<-- server SETUP role=producer
<-- server CATALOG broadcast=superbowl track=a track=b track=c
--> client SUBSCRIBE broadcast=superbowl track=b
<-- server OBJECT broadcast=superbowl track=b

2 RTTs to transfer the first frame.
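The producer side of this exchange can be sketched as a small state machine. This is a toy model, not an implementation of the draft: the `Producer` class and its method names are hypothetical, and track IDs are strings only to mirror the diagram. The one rule taken from the draft text in this PR is that only the most recent SUBSCRIBE for a broadcast is active.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Producer:
    """Tracks which advertised tracks the peer currently wants (hypothetical)."""

    catalog: dict[str, set[str]]  # broadcast -> track IDs advertised via CATALOG
    active: dict[str, set[str]] = field(default_factory=dict)

    def on_subscribe(self, broadcast: str, tracks: list[str]) -> None:
        advertised = self.catalog.get(broadcast)
        if advertised is None:
            raise KeyError(f"broadcast {broadcast!r} was never advertised")
        unknown = set(tracks) - advertised
        if unknown:
            raise ValueError(f"tracks not in catalog: {sorted(unknown)}")
        # The newest SUBSCRIBE replaces any earlier one for this broadcast.
        self.active[broadcast] = set(tracks)

    def should_send(self, broadcast: str, track: str) -> bool:
        """True if OBJECT messages for this track are currently wanted."""
        return track in self.active.get(broadcast, set())


producer = Producer(catalog={"superbowl": {"a", "b", "c"}})
producer.on_subscribe("superbowl", ["b"])
```

Under this model, a later `on_subscribe("superbowl", ["a"])` stops track `b` without a separate unsubscribe message, which is one possible answer to how a receiver drops tracks it no longer wants.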

Getting the broadcast URI out of band involves some additional RTTs. That's possible and might even be necessary, but I'd like to support clients that don't pre-negotiate.

An easy way to avoid an RTT is to include default tracks in the CATALOG, especially if there's not really a decision to make.

@vasilvv @wilaw It would be nice if we could combine the CONNECT request/response and the SETUP message. They serve pretty similar roles.

One option is to define a WARP url such that any setup parameters are transmitted as part of the path. For example

https://://<application-defined ...>

https://twitch.com:4433/protocol=warp,v=1,role=consumer,sendDefault=true/stream12345

This would allow the server to immediately start sending the default tracks to the client upon authentication of the WT connection.
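As a sketch of how a server might read setup parameters out of such a path, assuming the first path segment is a comma-separated list of `key=value` pairs as in the example URL above. The `parse_warp_url` function is hypothetical and the parameter names are simply those from the example; nothing here is defined by the draft.

```python
from urllib.parse import urlsplit


def parse_warp_url(url):
    """Split a URL into (setup parameters, application-defined remainder)."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) < 2:
        raise ValueError("expected /<params>/<application path>")
    params = {}
    for pair in segments[0].split(","):
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed parameter: {pair!r}")
        if key in params:
            raise ValueError(f"duplicate parameter: {key!r}")
        params[key] = value
    return params, "/".join(segments[1:])


params, stream = parse_warp_url(
    "https://twitch.com:4433/protocol=warp,v=1,role=consumer,sendDefault=true/stream12345"
)
```

Values are left as strings here; a real design would need to say how types, escaping of `,` and `/`, and unknown keys are handled.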

See prior comments on this PR that we could probably deprecate ROLE parameter to simplify the setup.

kpugin · 2023-01-13T22:27:48Z

draft-lcurley-warp.md

+
+
+## CATALOG {#message-catalog}
+The sender advertises an available broadcast and its tracks via the CATALOG message.


Is there a trigger for the sender to send a CATALOG message? Can a sender send multiple CATALOG messages?

Yeah, multiple CATALOG messages for multiple broadcasts. Currently there's no way to update an existing CATALOG but we'll define something.

kpugin · 2023-01-13T22:30:11Z

draft-lcurley-warp.md


 Every parameter MUST appear at most once within the SETUP message. The peers SHOULD verify that and close the connection if a parameter appears more than once.

 The ROLE parameter is mandatory for the client. All of the other parameters are optional.

-## ROLE parameter
+## ROLE parameter {#role}

 The ROLE parameter (key 0x00) allows the client to specify what roles it expects the parties to have in the Warp connection. It has three possible values:


Probably not related to this PR, but it would be useful to explain why the role is needed. Why can't everyone send and receive media?

Victor added it, but I think it's so an implementation doesn't need to support both roles. Kind of like a HTTP client versus a HTTP server. It's also just a nice declaration of intent.

I also believe that the ROLE parameter is redundant. A consumer is any client that issues a SUBSCRIBE request. A producer is any client that sends OBJECT and CATALOG messages. Each client is going to have to do the work at the application layer to protect itself against out-of-context, malformed or irrelevant messages that it receives. Declaring a role doesn't relieve a receiver of validating all incoming messages.
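For reference, the two SETUP rules quoted in the hunk above (each parameter appears at most once; ROLE is mandatory for the client) amount to a check like the following. This is a minimal sketch: `parse_setup_params` is a hypothetical name, parameters are assumed to be already decoded into `(key, value)` pairs, and only the ROLE key 0x00 comes from the draft.

```python
ROLE = 0x00  # ROLE parameter key, as given in the draft text


def parse_setup_params(pairs, from_client):
    """Validate decoded SETUP parameters and return them as a dict."""
    params = {}
    for key, value in pairs:
        if key in params:
            # The draft says peers SHOULD close the connection in this case.
            raise ValueError(f"SETUP parameter 0x{key:02x} appears more than once")
        params[key] = value
    if from_client and ROLE not in params:
        raise ValueError("client SETUP is missing the mandatory ROLE parameter")
    return params
```

If ROLE were removed as suggested, only the duplicate check would remain, and per-message validation would carry the rest.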

kpugin · 2023-01-13T22:31:44Z

draft-lcurley-warp.md


-An initialization segment consists of a File Type Box (ftyp) followed by a Movie Box (moov).
+The "Container Init Payload" in a CATALOG message ({{message-catalog}}) MUST consist of a File Type Box (ftyp) followed by a Movie Box (moov).


hm.... do we really need to spell it out here? maybe just link to fMP4 ISO spec?

heh I copied this from the HLS spec.

kpugin · 2023-01-13T22:32:35Z

draft-lcurley-warp.md


-An initialization segment consists of a File Type Box (ftyp) followed by a Movie Box (moov).
+The "Container Init Payload" in a CATALOG message ({{message-catalog}}) MUST consist of a File Type Box (ftyp) followed by a Movie Box (moov).
 This Movie Box (moov) consists of Movie Header Boxes (mvhd), Track Header Boxes (tkhd), Track Boxes (trak), followed by a final Movie Extends Box (mvex).


Interesting question: what happens if this doesn't match what was advertised in the CATALOG message?

The init segment is delivered via the CATALOG message so they're one and the same.
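A receiver could still sanity-check the payload it gets. The sketch below walks top-level ISO BMFF boxes (32-bit big-endian size plus 4-character type, with the standard 64-bit `largesize` and size-0 conventions) and checks that the payload is exactly an `ftyp` box followed by a `moov` box. Function names are hypothetical, and box contents are not inspected at all.

```python
import struct


def iter_boxes(data):
    """Yield (type, body) for each top-level ISO BMFF box in data."""
    offset = 0
    while offset < len(data):
        if offset + 8 > len(data):
            raise ValueError("truncated box header")
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit largesize follows the type field
            if offset + 16 > len(data):
                raise ValueError("truncated largesize")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("box size out of bounds")
        yield box_type.decode("ascii"), data[offset + header : offset + size]
        offset += size


def is_init_segment(payload):
    """True if payload is exactly an ftyp box followed by a moov box."""
    try:
        types = [box_type for box_type, _ in iter_boxes(payload)]
    except ValueError:  # also covers non-ASCII box types
        return False
    return types == ["ftyp", "moov"]


def make_box(box_type, body):
    """Build a box with a 32-bit size field (test helper)."""
    return struct.pack(">I4s", 8 + len(body), box_type) + body
```

The bodies passed to `make_box` in a test need not be valid `ftyp` or `moov` contents, since only the box order is checked.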

kpugin · 2023-01-13T22:36:11Z

I think one thing that is missing is some sort of sequence diagram: it's hard to understand what I need to do as a sender or consumer.

kpugin · 2023-01-13T22:41:14Z

draft-lcurley-warp.md

+
+~~~
+Track Descriptor {
+  Track ID (i),


Don't we need a track type? If there are 5 video tracks and 5 audio tracks, I need to be able to get 1 audio and 1 video, but how do I know which one is which?

That info (codec) is in the init segment, along with a bunch of other things that typically get sent in a playlist (ex. resolution, language, profile). It didn't seem useful to duplicate that information since it was already defined for fMP4.

If you want another format, ex. m3u8 or something to declare these parameters, then that should be a separate track format (currently called "container" but we'll rename). The "init payload" would be the m3u8 playlist.

Is there a way in fMP4 to define track groups? Let me add an example of what I mean. We have a live stream with 2 video POVs and 2 audio languages, all of them with 2 variants:

* Video POV1
  * Variant1 1080p@30fps - 10Mbps
  * Variant2 720p@30fps - 6Mbps
* Video POV2
  * Variant1 1080p@30fps - 10Mbps
  * Variant2 720p@30fps - 6Mbps
* Audio POV1 EN
  * Variant1 16b 44.1KHz - 128Kbps
  * Variant2 16b 44.1KHz - 64Kbps
* Audio POV1 ES
  * Variant1 16b 44.1KHz - 128Kbps
  * Variant2 16b 44.1KHz - 64Kbps
* Audio POV2 EN
  * Variant1 16b 44.1KHz - 128Kbps
  * Variant2 16b 44.1KHz - 64Kbps
* Audio POV2 ES
  * Variant1 16b 44.1KHz - 128Kbps
  * Variant2 16b 44.1KHz - 64Kbps

How would the player know the different POVs available, the different languages, and the relations between them? Is fMP4 / MP4 able to store those relations? (Basically I'm asking for what transport streams solve with PAT/PMT tables, or other formats solve with manifests.)

Note: I did NOT add subtitles to avoid more complexity.

I'm not sure. There's certainly some metadata missing in fMP4 (like bitrate) that we'll have to encode somewhere. Maybe add some fields, maybe extend fMP4, maybe make a new container, who knows.
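To make the open question concrete, here is one purely hypothetical shape for the missing metadata and how a player might use it. None of these fields exist in the draft or in this PR; `TrackInfo`, its attributes, and `pick` are invented for illustration only, and the thread leaves open whether such data would live in new CATALOG fields, an fMP4 extension, or a separate track format.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrackInfo:
    """Hypothetical per-track metadata; not part of the draft."""

    track_id: int
    kind: str                # "video" or "audio"
    pov: str                 # grouping key, e.g. "POV1"
    language: Optional[str]  # None for video
    bitrate_kbps: int


def pick(tracks, kind, pov, language=None, max_kbps=None):
    """Pick the highest-bitrate track in a group that fits the budget."""
    candidates = [
        t for t in tracks
        if t.kind == kind
        and t.pov == pov
        and (language is None or t.language == language)
        and (max_kbps is None or t.bitrate_kbps <= max_kbps)
    ]
    return max(candidates, key=lambda t: t.bitrate_kbps, default=None)


TRACKS = [
    TrackInfo(1, "video", "POV1", None, 10_000),
    TrackInfo(2, "video", "POV1", None, 6_000),
    TrackInfo(3, "audio", "POV1", "EN", 128),
    TrackInfo(4, "audio", "POV1", "EN", 64),
    TrackInfo(5, "audio", "POV1", "ES", 128),
    TrackInfo(6, "audio", "POV1", "ES", 64),
]

video = pick(TRACKS, "video", "POV1", max_kbps=8_000)
audio = pick(TRACKS, "audio", "POV1", language="ES")
```

With the sample list, the player would subscribe to one video and one audio track ID from the same POV group, which is the relation the example above says is currently not expressible.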

kpugin · 2023-01-13T22:41:57Z

draft-lcurley-warp.md

+CATALOG Message {
+  Broadcast URI (b),
+  Track Count (i),
+  Track Descriptors (..)


How do we model different audio/video qualities (1080p, 720p, etc.)?

I mentioned in another comment, but the (fMP4) init segment contains the media type and resolution.

jordicenzano

Sorry to be very late to this review. Great progress since the last time I took a look.
I really like this direction. Now it seems we are one step closer to a prototype, EXCITING!!
I left a few comments / questions.

jordicenzano · 2023-01-17T13:48:24Z

draft-lcurley-warp.md

@@ -151,15 +152,23 @@ Slice:

 Track:

-: An encoded bitstream, representing a single video/audio component that makes up the larger broadcast. See {{tracks}}.
+: An encoded bitstream, representing a single video/audio component that makes up the larger broadcast.


Should we also consider data tracks? (time metadata such as subtitles, SCTE35 markers, etc)

Absolutely.

jordicenzano · 2023-01-17T14:13:09Z

draft-lcurley-warp.md


 Messages SHOULD be sent over the same stream if ordering is desired.
-For example, `PAUSE` and `PLAY` messages SHOULD be sent on the same stream to avoid a race.
+Some messages MUST be sent over the same stream, for example SUBSCRIBE messages ({{message-subscribe}}) with the same broadcast ID.


Did we define broadcast ID? Perhaps this should be Broadcast URI?

jordicenzano · 2023-01-17T14:32:54Z

draft-lcurley-warp.md

+| 0x10 | GOAWAY ({{message-goaway}})       |
+|------|-----------------------------------|
+
+## SETUP {#message-setup}

 The `SETUP` message is the first message that is exchanged by the client and the server; it allows the peers to establish the mutually supported version and agree on the initial configuration. It is a sequence of key-value pairs called *SETUP parameters*; the semantics and the format of individual parameter values MAY depend on what party is sending it.


We could use some default values

jordicenzano · 2023-01-17T14:46:18Z

draft-lcurley-warp.md

+
+~~~
+CATALOG Message {
+  Broadcast URI (b),


jordicenzano · 2023-01-17T15:13:58Z

draft-lcurley-warp.md

+
+~~~
+Track Descriptor {
+  Track ID (i),


jordicenzano · 2023-01-17T15:15:31Z

draft-lcurley-warp.md

+
+
+Only the most recent SUBSCRIBE message for a broadcast is active.
+SUBSCRIBE messages MUST be sent on the same QUIC stream to preserve ordering.


"be sent on the same QUIC stream" the same as ...? SETUP?

kixelated · 2023-01-19T00:33:34Z

Okay, hitting the merge button and cutting a new draft before I go on vacation. Nothing is final!

wilaw · 2023-01-19T00:43:05Z

draft-lcurley-warp.md


 ## Streams
 Warp endpoints communicate over QUIC streams. Every stream is a sequence of messages, framed as described in {{messages}}.

-The first stream opened is a client-initiated bidirectional stream where the peers exchange SETUP messages ({{setup}}). The subsequent streams MAY be either unidirectional and bidirectional. For exchanging media, an application would typically send a unidirectional stream containing a single OBJECT message ({{object}}).
+The first stream opened is a client-initiated bidirectional stream where the peers exchange SETUP messages ({{message-setup}}). The subsequent streams MAY be either unidirectional and bidirectional. For exchanging media, an application would typically send a unidirectional stream containing a single OBJECT message ({{message-object}}).


The first stream opened is a client-initiated bidirectional stream where the peers exchange SETUP messages ({{message-setup}}).

What should the server do if the first stream opened is not bidirectional? Does it reject the SETUP messages? Since the server could easily open up a unidirectional stream back to the client, it seems this constraint could be relaxed, unless we move to a design in which all CONTROL messages are sent across a single bidi channel, which is a control singleton within the scope of the WT connection.

I believe the usual behavior is that you don't read any stream data on any stream until you find one that starts with a SETUP message.
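A sketch of that behavior, under stated assumptions: data arriving on any stream is held back until some stream is seen to begin with a SETUP message, after which the SETUP stream is processed first and the held data is released. The `Session` class is hypothetical, the SETUP type value is a placeholder (this thread does not give it), and the message type is assumed to fit in a single-byte varint.

```python
SETUP_TYPE = 0x01  # placeholder value; the real SETUP type is not given here


class Session:
    """Holds stream data until a stream starting with SETUP shows up."""

    def __init__(self):
        self.setup_seen = False
        self.pending = {}    # stream ID -> bytes held back, in arrival order
        self.delivered = []  # (stream ID, bytes) handed to the message parser

    def on_stream_data(self, stream_id, data):
        if self.setup_seen:
            self.delivered.append((stream_id, data))
            return
        buffered = self.pending.get(stream_id, b"") + data
        self.pending[stream_id] = buffered
        if buffered and buffered[0] == SETUP_TYPE:
            self.setup_seen = True
            # Process the SETUP stream first, then release everything held back.
            self.delivered.append((stream_id, self.pending.pop(stream_id)))
            self.delivered.extend(self.pending.items())
            self.pending.clear()
```

A real endpoint would also need a bound on how much it buffers before SETUP arrives, and a rule for whether SETUP on a unidirectional stream is acceptable, which is the question raised above.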

kixelated requested a review from vasilvv January 9, 2023 21:01

kixelated force-pushed the catalog branch from 355080e to fc5283a Compare January 9, 2023 21:02

kixelated force-pushed the catalog branch from fc5283a to b9d1e5d Compare January 9, 2023 21:02

afrind reviewed Jan 9, 2023

kixelated commented Jan 10, 2023

kixelated mentioned this pull request Jan 11, 2023

Initial track PUBLISH/SUBSCRIBE #43

Closed

Code review changes.

442f1e0

vasilvv approved these changes Jan 13, 2023

kpugin reviewed Jan 13, 2023

CR comments.

dd15fd7

jordicenzano reviewed Jan 17, 2023

Jordi comments.

5babe2d

kixelated merged commit 2e4909c into main Jan 19, 2023

kixelated deleted the catalog branch January 19, 2023 00:33

wilaw reviewed Jan 19, 2023

kixelated mentioned this pull request Jan 27, 2023

Globally unique broadcast URI #70

Closed

kixelated mentioned this pull request Apr 22, 2023

Should PUBLISH be a separate message? #143

Closed
