Low Latency streaming

dsilhavy edited this page Dec 9, 2020 · 44 revisions

One of the major challenges in OTT streaming is reducing live streaming latency. Low latency can be crucial for live events such as sports games and for smooth interaction between streamers and viewers in eSports.

Use case: On Par with Other Distribution Means

A live event is distributed over DASH as well as over regular TV distribution. The event should play out at approximately the same time on both devices in order to avoid different perceptions of the same service when received over different distribution means. The objective should be to get to a range of delay for the DASH based service that is equivalent to cable and IPTV services [1].

Use case: Sports Bar

Sports bars are commonly in close proximity to each other and may all show the same live sporting event. Some bars may be using a provider which distributes the content using DVB-T or DVB-S services whilst others may be using DASH ABR. Viewers in a bar with a high latency will have their viewing spoiled as they will hear cheers for the goal before it occurs on their local screen.

This creates a commercial incentive for the bar operator to switch to the provider with the lowest latency. The objective should be to get the latency range to not be perceptibly different to that of a DVB broadcast solution for those users who have a sufficient quality (high and consistent speed) connection [1].

Use case: Professional streamer with interactive chat

Professional streamers interact with a live audience on social media, often via a directly coupled chat function in the viewing app/environment. They can generate direct revenue in several ways including:

  • In stream advertising
  • Micropayments (for example Twitch “bits”)

A high degree of interactivity between the performer and the audience is required to enable engagement. Lower latency increases the engagement and consequently the incentive for the audience members to reward the performer with likes, shares, subscriptions, micropayments, etc.

Typical use cases include gamers, musicians and other performers where in some part the direction of the performance can be guided by the audience response [1].

Use case: Sports betting

A provider wants to offer a live stream that will be used for wagering within an event. The content must be delivered with low latency and more importantly within a well-defined sync across endpoints so customers trust the game is fair. There are in some cases legal considerations, for example the content cannot be shown if it is more than X seconds behind live.

Visual and aural quality are secondary in priority in these scenarios to sync and latency. The lower the latency the more opportunities for “in play betting” within the game/event. This in turn increases revenue potential from a game/event [1].

CMAF low latency streaming

The Common Media Application Format (CMAF) introduces the concept of "chunks". A CMAF segment can be split into multiple chunks, each consisting of a "moof" and an "mdat" box, allowing the client to access the media data before the segment is completely finished. The benefits of the chunked mode become more obvious when looking at a concrete example:

So let’s assume we have 8 second segments and we are currently 3 seconds into segment number four. For classic media segments, this leaves us with two options:

  • Option 1: since segment four is not completed, we start with segment three. That way, we end up 11 seconds behind the live edge – 8 seconds coming from segment three, and 3 seconds coming from segment four.
  • Option 2: we wait for segment four to finish and immediately start downloading and playing it. We end up with 8 seconds of latency and a waiting time of 5 seconds.

With CMAF chunks, on the other hand, we are able to play segment four before it is completely available. In the example above, we have CMAF chunks with a 1 second duration, which leads to eight chunks per segment. Let’s assume that only the first chunk contains an IDR frame and therefore we always need to start the playback from the beginning of a segment. Being three seconds into segment four leaves us with 3 seconds of latency. That’s much better than what we achieved with classic segments. We can also fast decode the first chunks and play even closer to the live edge [2].
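The arithmetic of this example can be reproduced with a few lines of JavaScript. This is only a sketch of the numbers above: the function names are made up for illustration and all values are in seconds.

```javascript
// Hypothetical helpers that reproduce the arithmetic of the example above.
function classicLatencyOptions(segmentDuration, elapsedInCurrentSegment) {
    return {
        // Option 1: start with the previous, fully available segment.
        startWithPreviousSegment: {
            latency: segmentDuration + elapsedInCurrentSegment,
            wait: 0
        },
        // Option 2: wait until the current segment is complete, then play it from its start.
        waitForCurrentSegment: {
            latency: segmentDuration,
            wait: segmentDuration - elapsedInCurrentSegment
        }
    };
}

function chunkedLatency(elapsedInCurrentSegment) {
    // Playback starts at the beginning of the current segment because only
    // the first chunk is assumed to contain an IDR frame.
    return elapsedInCurrentSegment;
}

const options = classicLatencyOptions(8, 3);
console.log(options.startWithPreviousSegment.latency); // 11
console.log(options.waitForCurrentSegment.latency, options.waitForCurrentSegment.wait); // 8 5
console.log(chunkedLatency(3)); // 3
```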

CMAF low latency streaming with dash.js

dash.js has supported CMAF low latency streaming since version 2.6.8. Multiple samples are available:

dash.js configuration

The sections below give a detailed explanation of L2A-LL and LoL+. Some parameters are valid for all low latency algorithms:

Parameter | Description
lowLatencyEnabled | Enables low latency mode.
liveDelay | Target live delay in seconds. Lowering this value will lower latency but may decrease the player's ability to build a stable buffer.
minDrift | Minimum latency deviation allowed before the catch-up mechanism is activated.
maxDrift | Maximum latency deviation allowed before dash.js seeks to the live position.
playbackRate | Maximum catch-up rate, as a percentage, for low latency live streams.
latencyThreshold | The maximum latency up to which live catch-up is applied. For instance, if this value is set to 8 seconds, live catch-up is only applied if the current live latency is equal to or below 8 seconds. This parameter avoids an increase of the playback rate when the user seeks within the DVR window.

The corresponding API call looks as follows:

player.updateSettings({
    streaming: {
        lowLatencyEnabled: true,
        liveDelay: 4,
        liveCatchup: {
            minDrift: 0.02,
            maxDrift: 0,
            playbackRate: 0.5,
            latencyThreshold: 60
        }
    }
});

Please check the API documentation for additional information.
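To illustrate how these parameters could interact, the sketch below maps a measured latency to a catch-up decision. It is a simplified illustration and not the dash.js implementation: `decideCatchUpAction` is a hypothetical helper, it assumes that a `maxDrift` of 0 disables seeking, and it always applies the maximum catch-up rate instead of ramping up gradually.

```javascript
// Hypothetical helper, NOT part of the dash.js API. It gives a simplified
// view of how the catch-up parameters could interact. Latencies are in seconds.
function decideCatchUpAction(currentLatency, streamingSettings) {
    const { liveDelay, liveCatchup } = streamingSettings;
    const { minDrift, maxDrift, playbackRate, latencyThreshold } = liveCatchup;
    const drift = currentLatency - liveDelay;

    // Above the threshold the user has probably moved back within the DVR window.
    if (currentLatency > latencyThreshold) {
        return { action: 'none', rate: 1 };
    }
    // Assumption: maxDrift = 0 disables seeking back to the live position.
    if (maxDrift > 0 && drift > maxDrift) {
        return { action: 'seek', rate: 1 };
    }
    // Simplification: jump straight to the maximum catch-up rate.
    if (drift > minDrift) {
        return { action: 'speed-up', rate: 1 + playbackRate };
    }
    return { action: 'none', rate: 1 };
}

const settings = {
    liveDelay: 4,
    liveCatchup: { minDrift: 0.02, maxDrift: 0, playbackRate: 0.5, latencyThreshold: 60 }
};

console.log(decideCatchUpAction(5, settings));  // one second behind the target
console.log(decideCatchUpAction(70, settings)); // beyond latencyThreshold
```

With the settings from the API call above, a latency of 5 seconds is one second behind the target and leads to a faster playback rate, while a latency of 70 seconds exceeds `latencyThreshold` and leaves the playback rate untouched.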

dash.js requirements

In order to use dash.js in low latency mode, the following requirements have to be fulfilled:

Client requirements

Server and content requirements

The content and the manifest must be conditioned to support CMAF low latency chunks.

The manifest must contain two additional attributes:

  • @availabilityTimeComplete: specifies if all segments of all associated representations are complete at the adjusted availability start time. If the value is set to false, then it may be inferred by the client that the segment is available at its announced location prior to completion.
  • @availabilityTimeOffset (ATO): specifies how much earlier segments become available compared to their computed availability start time (AST).
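The effect of the ATO can be sketched with a small calculation. The helper below is hypothetical and assumes segments of constant duration, a zero-based segment index, and times in seconds on a common clock. The ATO of 7 seconds is an assumed value (segment duration minus chunk duration) chosen to match the 8 second segments and 1 second chunks used earlier on this page.

```javascript
// Hypothetical helper: computes when a segment becomes available.
function segmentAvailabilityTimes(availabilityStartTime, periodStart, segmentIndex, segmentDuration, availabilityTimeOffset) {
    // A complete segment becomes available once it is fully produced, i.e. at its end.
    const nominal = availabilityStartTime + periodStart + (segmentIndex + 1) * segmentDuration;
    // With an ATO the client may request the segment earlier.
    const adjusted = nominal - availabilityTimeOffset;
    return { nominal, adjusted };
}

// Fourth segment (index 3) of a stream with 8 second segments and an assumed ATO of 7 seconds.
const times = segmentAvailabilityTimes(0, 0, 3, 8, 7);
console.log(times.nominal);  // 32
console.log(times.adjusted); // 25
```

In this example the fourth segment covers the media time from 24 to 32 seconds. Without an ATO the client could request it at 32 seconds, with the ATO it can start the request at 25 seconds, when the first chunk is complete.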

The segments must contain multiple CMAF chunks. This will result in multiple "moof" and "mdat" boxes per segment. Example:

[styp] size=8+16
[prft] size=8+24
[moof] size=8+96
  [mfhd] size=12+4
    sequence number = 827234
  [traf] size=8+72
    [tfhd] size=12+16, flags=20038
      track ID = 1
      default sample duration = 1001
      default sample size = 15704
      default sample flags = 1010000
    [tfdt] size=12+8, version=1
      base media decode time = 828060233
    [trun] size=12+12, flags=5
      sample count = 1
      data offset = 112
      first sample flags = 2000000
[mdat] size=8+15704
[prft] size=8+24
[moof] size=8+92
  [mfhd] size=12+4
    sequence number = 827235
  [traf] size=8+68
    [tfhd] size=12+16, flags=20038
      track ID = 1
      default sample duration = 1001
      default sample size = 897
      default sample flags = 1010000
    [tfdt] size=12+8, version=1
      base media decode time = 828061234
    [trun] size=12+8, flags=1
      sample count = 1
      data offset = 108
[mdat] size=8+897
[prft] size=8+24
[moof] size=8+92
  [mfhd] size=12+4
    sequence number = 827236
  [traf] size=8+68
    [tfhd] size=12+16, flags=20038
      track ID = 1
      default sample duration = 1001
      default sample size = 7426
      default sample flags = 1010000
    [tfdt] size=12+8, version=1
      base media decode time = 828062235
    [trun] size=12+8, flags=1
      sample count = 1
      data offset = 108
[mdat] size=8+7426
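A dump like the one above can be approximated in a few lines by walking the top-level boxes of a segment. The following is a minimal sketch and not the parser used by dash.js: `listTopLevelBoxes`, `makeBox` and `concat` are hypothetical helpers, and the special box sizes 0 (box extends to the end of the data) and 1 (64-bit size) are not handled.

```javascript
// Hypothetical helper: lists the top-level ISO BMFF boxes in a byte array.
function listTopLevelBoxes(bytes) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const boxes = [];
    let offset = 0;
    while (offset + 8 <= bytes.byteLength) {
        const size = view.getUint32(offset); // box sizes are stored big-endian
        const type = String.fromCharCode(...bytes.subarray(offset + 4, offset + 8));
        if (size < 8) {
            break; // sizes 0 and 1 are special cases that this sketch ignores
        }
        boxes.push({ type, size, complete: offset + size <= bytes.byteLength });
        offset += size;
    }
    return boxes;
}

// Builds a box with an all-zero payload, for demonstration purposes only.
function makeBox(type, payloadLength) {
    const box = new Uint8Array(8 + payloadLength);
    new DataView(box.buffer).setUint32(0, box.byteLength);
    for (let i = 0; i < 4; i++) {
        box[4 + i] = type.charCodeAt(i);
    }
    return box;
}

function concat(parts) {
    const out = new Uint8Array(parts.reduce((sum, p) => sum + p.byteLength, 0));
    let pos = 0;
    for (const p of parts) {
        out.set(p, pos);
        pos += p.byteLength;
    }
    return out;
}

// Synthetic segment with the box sizes of the first two chunks shown above.
const segment = concat([
    makeBox('styp', 16),
    makeBox('prft', 24), makeBox('moof', 96), makeBox('mdat', 15704),
    makeBox('prft', 24), makeBox('moof', 92), makeBox('mdat', 897)
]);
const boxes = listTopLevelBoxes(segment);
const chunkCount = boxes.filter(box => box.type === 'moof').length;
console.log(boxes.map(box => box.type).join(' '), chunkCount);
```

Counting the "moof" boxes gives the number of chunks in the segment, which is a quick way to check whether content has been packaged in chunked mode.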

Challenges in low latency streaming

Compared to ABR algorithms for "classic" live streaming, an ABR algorithm for low latency streaming has to overcome additional challenges.

Challenge 1: Throughput estimation

Common throughput based ABR algorithms calculate the available bandwidth on the client side using the download time for a segment:

Calculated Throughput = (Segment@Bitrate*Segment@duration) / DownloadTime

Example: 

Calculated Throughput = (6Mbit/s * 6s) / 3s = 12 Mbit/s

The concept described above is a problem for clients operating in low latency mode. Since segments are transferred via HTTP/1.1 chunked transfer encoding, the download time of a segment is similar to its duration. The download of a segment is started prior to its completion. Therefore, the data arrives in small chunks at the client side.

For instance, the download time for a segment with six second duration will be approximately six seconds. There will be idle times in which no data is transferred from the server to the client. However, the connection remains open while the client waits for new data. The total download time includes these idle times. Consequently, the total download time is not a good indicator for the available bandwidth on the client side.
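The difference between the two cases can be made concrete with the formula from above. The sketch below uses a hypothetical `calculateThroughput` helper, with bitrates in Mbit/s and times in seconds.

```javascript
// Throughput = (segment bitrate * segment duration) / download time
function calculateThroughput(bitrateMbps, durationSeconds, downloadTimeSeconds) {
    return (bitrateMbps * durationSeconds) / downloadTimeSeconds;
}

// Classic live streaming: a complete 6 second segment is fetched in 3 seconds.
const classicEstimate = calculateThroughput(6, 6, 3);
console.log(classicEstimate); // 12

// Low latency mode: the segment is still being produced while it is downloaded,
// so the download takes roughly as long as the segment duration.
const lowLatencyEstimate = calculateThroughput(6, 6, 6);
console.log(lowLatencyEstimate); // 6
```

In low latency mode the estimate collapses to the bitrate of the segment itself, no matter how much bandwidth is actually available, so the client never sees evidence that it could switch to a higher bitrate.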

Low latency throughput estimation in dash.js

dash.js offers two different modes for low latency throughput estimation:

Default throughput estimation

For every segment that is downloaded, the default algorithm saves the timestamp and the number of bytes received throughout the download process. The data packets do not arrive at moof boundaries. For instance, a single "data burst" might contain multiple moof/mdat pairs. For every data point an entry in the corresponding array is created:

downloadedData.push({
    ts: Date.now(), // timestamp when the data arrived
    bytes: value.length // length of the data
});

After the download of a segment is completed, the array above is cleared and the throughput is calculated in the following way:

    function _calculateDownloadedTimeByBytesReceived(downloadedData, bytesReceived) {
        // To be counted, a data point has to be larger than a quarter of the average bytes per data point
        downloadedData = downloadedData.filter(data => data.bytes > ((bytesReceived / 4) / downloadedData.length));
        if (downloadedData.length > 1) {
            let time = 0;
            // Average time distance between two consecutive data points
            const avgTimeDistance = (downloadedData[downloadedData.length - 1].ts - downloadedData[0].ts) / downloadedData.length;
            downloadedData.forEach((data, index) => {
                const next = downloadedData[index + 1];
                if (next) {
                    // Gaps at or above the average are treated as idle time and ignored
                    const distance = next.ts - data.ts;
                    time += distance < avgTimeDistance ? distance : 0;
                }
            });
            return time;
        }
        return null; // not enough data points for an estimate
    }

  1. In the first step the downloadedData array is filtered: all entries that are not larger than a quarter of the average number of bytes per entry are removed.
  2. In the next step the average time distance between two consecutive data points is calculated.
  3. If the time distance between two consecutive data points is smaller than the average time distance, the time distance is added to the total download time. Larger gaps are treated as idle time and ignored.
  4. The total download time is used to calculate the throughput as described before. Using this approach the download time is no longer equal to the duration of the segment.
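The four steps can be tried out on synthetic data. The sketch below restates the calculation under a hypothetical name (`estimateDownloadTime`) so that it runs on its own. The data points are invented for illustration: two bursts of data separated by an idle period of almost one second, followed by a tiny trailing packet.

```javascript
// Stand-alone restatement of the steps above (hypothetical name, same logic).
function estimateDownloadTime(downloadedData, bytesReceived) {
    // Step 1: drop data points that are too small to be meaningful.
    const threshold = (bytesReceived / 4) / downloadedData.length;
    const filtered = downloadedData.filter(data => data.bytes > threshold);
    if (filtered.length <= 1) {
        return null;
    }
    // Step 2: average time distance between consecutive data points.
    const avgTimeDistance = (filtered[filtered.length - 1].ts - filtered[0].ts) / filtered.length;
    // Step 3: only sum up gaps that are shorter than the average.
    let time = 0;
    for (let i = 0; i < filtered.length - 1; i++) {
        const distance = filtered[i + 1].ts - filtered[i].ts;
        if (distance < avgTimeDistance) {
            time += distance;
        }
    }
    return time;
}

// Invented sample: timestamps in milliseconds, sizes in bytes.
const sample = [
    { ts: 0, bytes: 1000 },
    { ts: 10, bytes: 1000 },
    { ts: 20, bytes: 1000 },
    { ts: 1000, bytes: 1000 }, // arrives after an idle period
    { ts: 1010, bytes: 1000 },
    { ts: 1020, bytes: 50 }    // too small, removed in step 1
];
const totalBytes = sample.reduce((sum, data) => sum + data.bytes, 0);

const activeTimeMs = estimateDownloadTime(sample, totalBytes);
console.log(activeTimeMs); // 30

// Step 4: throughput in bit/s based on the active download time only.
console.log((totalBytes * 8) / (activeTimeMs / 1000));
```

The idle gap of 980 ms is above the average distance and is therefore ignored, so only 30 ms count as download time instead of the 1020 ms the request was open.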

Moof based throughput estimation

In contrast to the default throughput algorithm, the moof based throughput estimation is based on saving the download time for each CMAF chunk. For that reason, the start and end time of each chunk, starting with a moof box and ending with an mdat box, are saved:

// Store the start time of each chunk download
const flag1 = boxParser.parsePayload(['moof'], remaining, offset);
if (flag1.found) {
    startTimeData.push({
        ts: performance.now(),
        bytes: value.length
    });
}

const boxesInfo = boxParser.findLastTopIsoBoxCompleted(['moov', 'mdat'], remaining, offset);
if (boxesInfo.found) {
    const end = boxesInfo.lastCompletedOffset + boxesInfo.size;

    // Store the end time of each chunk download
    endTimeData.push({
        ts: performance.now(),
        bytes: remaining.length
    });
}

The download time of the segment is calculated in the following way:

    function _calculateDownloadedTimeByMoofParsing(startTimeData, endTimeData) {
        let datum, datumE;
        // Filter the first and last chunks in a segment in both arrays [StartTimeData and EndTimeData]
        datum = startTimeData.filter((data, i) => i > 0 && i < startTimeData.length - 1);
        datumE = endTimeData.filter((dataE, i) => i > 0 && i < endTimeData.length - 1);
        // Compute the download time of a segment based on the filtered data [last chunk end time - first chunk beginning time]
        if (datum.length > 1) {
            let segDownloadtime = datumE[datumE.length - 1].ts - datum[0].ts;
            return segDownloadtime;
        }
        return null; // not enough chunks for an estimate
    }
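A short, self-contained example shows what this calculation returns for a segment with five chunks. The function is restated under a hypothetical name (`downloadTimeByChunkTimestamps`) and the timestamps, given in milliseconds, are invented for illustration.

```javascript
// Stand-alone restatement of the calculation above (hypothetical name).
function downloadTimeByChunkTimestamps(startTimeData, endTimeData) {
    // Ignore the first and the last chunk of the segment in both arrays.
    const starts = startTimeData.filter((_, i) => i > 0 && i < startTimeData.length - 1);
    const ends = endTimeData.filter((_, i) => i > 0 && i < endTimeData.length - 1);
    if (starts.length <= 1 || ends.length === 0) {
        return null;
    }
    // End time of the last remaining chunk minus start time of the first remaining chunk.
    return ends[ends.length - 1].ts - starts[0].ts;
}

// Invented timestamps for five chunks that arrive roughly one second apart.
const chunkStarts = [0, 1000, 2000, 3000, 4000].map(ts => ({ ts }));
const chunkEnds = [50, 1040, 2030, 3060, 4020].map(ts => ({ ts }));

const moofDownloadTime = downloadTimeByChunkTimestamps(chunkStarts, chunkEnds);
console.log(moofDownloadTime); // 2060
```

After the first and last chunks are dropped, the result is the end time of the fourth chunk (3060) minus the start time of the second chunk (1000).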

dash.js configuration

The desired throughput calculation mode can be selected by changing the respective settings parameter:

Value | Mode
ABR_FETCH_THROUGHPUT_CALCULATION_DOWNLOADED_DATA | Default throughput estimation
ABR_FETCH_THROUGHPUT_CALCULATION_MOOF_PARSING | Moof based throughput estimation

player.updateSettings({
    streaming: {
        abr: {
            fetchThroughputCalculationMode: Constants.ABR_FETCH_THROUGHPUT_CALCULATION_DOWNLOADED_DATA
        }
    }
});

Challenge 2: Maintaining a consistent live edge

When playing in low latency mode, the client needs to maintain a consistent live edge, allowing only small deviations from the target latency.

Maintaining a consistent live edge in dash.js

dash.js configuration

Low latency ABR algorithms in dash.js

LoL+

TBD

Tuning parameters

TBD

dash.js configuration

L2A

TBD

Tuning parameters

TBD

dash.js configuration

Test streams

Test results

Spreadsheet

Material

Articles

Videos

Bibliography
