# Video Streaming

Once video is captured and stored on a server, a user can stream it over a direct connection (the *client-server* model). Another service model adds an intermediary: the video is first pushed to edge caches (servers placed at the network edge), and the client loads the video from the closest of these servers (the *content-delivery network* model, used by e.g. YouTube). A *Peer-to-Peer* (P2P) network can also be established for video streaming. In a P2P network, users act as both clients and servers, simultaneously downloading data from the network and uploading it to other users. This method of downloading and streaming is highly efficient (it leads to major bandwidth savings for the distributor), but for streaming in particular it can be quite sensitive to peer departures (although there are methods to combat this effect).

Once a connection to a server is established, there are a few common techniques for content delivery. The traditional web transfer protocol, *HTTP*, takes a request from a client and returns content in large segments. Improvements built on this protocol have been developed for video streaming. *DASH* (Dynamic Adaptive Streaming over HTTP) is one modern streaming technique built on HTTP: the file is split into segments, and the content is transmitted adaptively, so the client can switch between versions encoded at different bitrates as network conditions change.
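
The adaptive step can be illustrated with a minimal sketch. It assumes a hypothetical bitrate ladder (`LADDER_KBPS`), a hypothetical safety factor, and a function name (`pick_rendition`) of our own choosing. None of these come from the DASH standard, which leaves the adaptation logic to the client.

```python
# Hypothetical bitrate ladder: the bitrates (kbit/s) at which each segment is available.
LADDER_KBPS = [400, 1000, 2500, 5000, 8000]

def pick_rendition(throughput_kbps, ladder=LADDER_KBPS, safety=0.8):
    """Return the highest bitrate not exceeding safety * measured throughput.

    Falls back to the lowest bitrate when even that one does not fit.
    """
    budget = safety * throughput_kbps
    fitting = [b for b in sorted(ladder) if b <= budget]
    return fitting[-1] if fitting else min(ladder)

# Simulated throughput measurements (kbit/s), one per downloaded segment.
measured = [6500, 3000, 900, 300, 12000]
choices = [pick_rendition(t) for t in measured]
print(choices)  # [5000, 1000, 400, 400, 8000]
```

The safety factor keeps the chosen bitrate somewhat below the measured throughput, trading a little quality for a lower risk of buffering.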

The quality of a video stream -- measured subjectively -- is affected by the video capture, compression,  transmission, and also by factors on the audience end. We present a non-exhaustive tour of some important features.

**Resolution: number of pixels displayed and/or captured.**  
Commonly,
- 720 x 480 (or 480p) "SD"
- 1280 x 720 (or 720p) "HD"
- 1920 x 1080 (or 1080p) "Full HD"
- 2560 x 1440 (or 1440p) "QHD"
- 3840 x 2160 (or 2160p) "4K"
- 7680 x 4320 (or 4320p) "8K"

**Framerate: rate at which frames are displayed.**
Increasing the framerate (frames per second, fps) results in smoother video at a given resolution. Standards for framerate vary between media: a typical computer screen refreshes at 60 Hz (so it can display up to 60 fps), while movie theatres use 24 fps.

**Bitrate: amount of video data transmitted per second.**

We need to ensure that most users have adequate bandwidth for the stream. There is a trade-off between possible lag/buffering (bitrate too high for the connection) and loss of quality (bitrate too low).
Higher resolution does not necessarily mean higher quality: we need to take into account the display of the device on which the video will be streamed. Data may be captured at high resolution but then downsampled for cost/efficiency, because higher resolution requires a higher bitrate. Naturally, a higher framerate also requires a higher bitrate.

Higher motion and detail, for example, both require higher bitrate to maintain quality (note: most streams have variable bitrate for this reason). 
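
To see why resolution and framerate drive bitrate, it helps to compute the bitrate of *uncompressed* video: width × height × bits per pixel × frames per second. The sketch below assumes 24 bits per pixel (8-bit RGB without chroma subsampling) and uses a hypothetical helper name, `raw_bitrate_mbps`; compressed streams are far smaller, but they scale in the same direction.

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed bitrate in megabits per second (1 Mbit = 10**6 bits)."""
    return width * height * bits_per_pixel * fps / 1e6

resolutions = {"720p": (1280, 720), "1080p": (1920, 1080), "2160p": (3840, 2160)}
for name, (w, h) in resolutions.items():
    print(name, round(raw_bitrate_mbps(w, h, fps=30)), "Mbit/s")
```

Going from 1080p to 2160p quadruples the pixel count and therefore the raw bitrate, and doubling the framerate doubles it.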

## Video Compression

**Frame-by-Frame encoding**
Each individual frame is compressed on its own, resulting in a series of compressed images. Since no information from other frames is required for compression, the data can be compressed, transmitted, decompressed and then displayed rapidly (low latency). There is also no risk of a frame being invalid because a frame it depends on was lost or damaged, which is useful for surveillance. The static-image compression algorithms used are generally JPEG or JPEG 2000.

**Temporal encoding**
With temporal encoding, we also consider changes from frame to frame, attempting to store only the changes between frames. This can lead to greater savings than frame-by-frame encoding, but runs the risk of losing information. Standard codecs (encoding/decoding algorithms) are H.264 and MPEG-4.
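
The idea of storing only the changes between frames can be illustrated with a toy example on synthetic frames. This is only a sketch of frame differencing under our own assumptions (tiny 8x8 frames, lossless differences); real codecs such as H.264 use block-based prediction, transforms and entropy coding instead.

```python
import numpy as np

# Synthetic video: a static 8x8 background with one bright pixel moving right.
frames = np.zeros((4, 8, 8), dtype=np.int16)
for t in range(4):
    frames[t, 3, t] = 255

# Encode: keep the first frame whole, then only the frame-to-frame differences.
first = frames[0]
diffs = np.diff(frames, axis=0)  # diffs[t] = frames[t + 1] - frames[t]

# Decode: add the accumulated differences back onto the first frame.
decoded = np.concatenate([first[None], first + np.cumsum(diffs, axis=0)])

print("lossless:", np.array_equal(decoded, frames))
print("nonzero values per difference frame:", np.count_nonzero(diffs, axis=(1, 2)))
```

Each difference frame has only two nonzero entries (the pixel that turned off and the one that turned on), which is what makes it cheap to store compared with a full frame.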

In the original MPEG-4, we encode frames as a *Group of Pictures (GOP)* made up of three frame types: Intra(I)-frames, Predicted(P)-frames and Bidirectional(B)-frames.
- I-Frames are complete encoded images (they can be decoded without reference to other frames). They are needed as starting points, as reference points if the stream is damaged, and for "random access" functions (e.g. fast-forwarding). They introduce no inter-frame prediction artifacts but, on the other hand, are more expensive (they require more bits).
- P-Frames are coded with reference to an earlier I- or P-frame. This reference is in general complex and, as such, they are sensitive to transmission errors. They require fewer bits than I-frames.
- B-Frames contain information on the changes between previous and following frames. Lower latency can be achieved without using B-frames.

This codec is used in low-resolution cameras.
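
The GOP structure described above can be sketched as a string of frame types in display order. The function name `gop_pattern` and its parameters are hypothetical, and the sketch assumes a fixed number of B-frames before each P-frame; real encoders may vary the pattern.

```python
def gop_pattern(gop_length, b_frames=2):
    """Frame types of one GOP in display order, e.g. 'IBBPBBPBBPBB'.

    gop_length: total frames in the GOP (one I-frame plus P- and B-frames).
    b_frames:   number of B-frames placed before each P-frame.
    """
    types = ["I"]
    while len(types) < gop_length:
        # Every (b_frames + 1)-th position after the I-frame is a P-frame.
        is_p = len(types) % (b_frames + 1) == 0
        types.append("P" if is_p else "B")
    return "".join(types)

print(gop_pattern(12))             # IBBPBBPBBPBB
print(gop_pattern(6, b_frames=0))  # IPPPPP
```

Setting `b_frames=0` gives the low-latency configuration mentioned above, in which every frame depends only on earlier frames.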

H.264 (MPEG-4 Part 10, also known as AVC) is an extension of MPEG-4. Additional features include motion compensation (used in P- and B-frames): motion vectors describe the movement of an object relative to a reference frame. Coding a motion vector takes fewer bits than coding the whole content (although computing it is, in general, computationally demanding). H.264 can also reduce the size of I-frames. It is generally seen as the "highest standard" for video compression: on average, it can reduce bitrate by ~50% at fixed quality (less bandwidth and storage space needed: higher quality at a lower bitrate!). High-resolution cameras are needed, however, and the result is higher latency.
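
A motion vector can be found by block matching: search the reference frame for the block that best matches a block of the current frame. The sketch below is an illustrative exhaustive search using the sum of absolute differences (SAD) on synthetic data; the function name, block size and search range are our own assumptions, and practical H.264 encoders use more elaborate search strategies and sub-pixel precision.

```python
import numpy as np

def best_motion_vector(ref, cur, top, left, size=4, search=3):
    """Exhaustive block matching: find (dy, dx) such that the block of `cur`
    at (top, left) best matches the block of `ref` at (top + dy, left + dx),
    using the sum of absolute differences (SAD)."""
    block = cur[top:top + size, left:left + size].astype(np.int32)
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > ref.shape[0] or x + size > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            cand = ref[y:y + size, x:x + size].astype(np.int32)
            sad = np.abs(block - cand).sum()
            if best_sad is None or sad < best_sad:
                best, best_sad = (dy, dx), sad
    return best, best_sad

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(16, 16), dtype=np.uint8)
# Current frame: the reference shifted 1 pixel down and 2 pixels right.
cur = np.roll(ref, shift=(1, 2), axis=(0, 1))

vector, sad = best_motion_vector(ref, cur, top=6, left=6)
print(vector, sad)
```

Here the 4x4 block is described by a single vector pointing back to where its content sits in the reference frame, instead of 16 pixel values, which is the source of the bit savings. The nested search loops also show why computing the vector is demanding.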

In our data set (provided by Roger Donaldson of Midvale Applied Mathematics), we are given various camera, scene and transmission data, but no user data.


# A first look at the data

Our goal will be to create a model which predicts bitrate given various camera and scene features.  Bitrate measurements were taken after capturing and compressing video under different test conditions (listed below). The data was provided by Roger Donaldson (Midvale Applied Mathematics).

### Camera Features (from 8 different cameras)
- Flicker: camera cuts out light flicker, depending on the region
- KeyFrame: number of P- and B-frames between I-frames (minus 1)
- ImageRate: framerate (fps)
- Quality: compression control, low number is best quality
- Nonlinear: single or multi-exposure frames (HDR)
- Mode: high performance or standard modes
- Compression: additional compression mode
- KpbsLimit: user-set maximum bitrate
- Primary Resolution: highest resolution stream
- Secondary Resolution: lower resolution stream
- Tertiary Resolution: lowest resolution stream

### Scene Features
- Test: type of test (Base, Idle, Compression, or HDR)
- Motion: amount of motion in scene, classified as high, medium or none
- Detail: the amount of detail in scene, classified as high, medium, or low

### Measurements
- CollectSeconds: time of data collection
- WaitSeconds: time between measurements
- TotalBytes: total number of bytes transmitted to the network
- Primary Bitrate: mean bitrate for primary stream
- Secondary Bitrate: mean bitrate for secondary stream
- Tertiary Bitrate: mean bitrate for tertiary stream

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

# First load data from cameras
A3data = pd.read_csv('../data/A3.csv')
# D16data = pd.read_csv('../data/D16.csv')
print(A3data.head())

In [None]:
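# Convert resolution to number (log scale)
def res_to_number(r):
    """Return the pixel count (width * height) for strings such as '1920x1080'."""
    s = np.zeros(len(r))
    for i, ri in enumerate(r):
        xloc = ri.find('x')  # position of the separator between width and height
        s[i] = int(ri[:xloc]) * int(ri[xloc+1:])
    return s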

PrimRes_num_A3 = np.log10(res_to_number(A3data['PrimaryResolution'].values))
SecRes_num_A3 = np.log10(res_to_number(A3data['SecondaryResolution'].values))


# Convert motion and detail to number
def categ_ordered_to_num(r, vals):
    """Map each category in r to its position in the ordered list vals."""
    s = np.zeros(len(r))
    for i, ri in enumerate(r):
        s[i] = vals.index(ri)  # raises ValueError if ri is not listed in vals
    return s

Detail_num_A3 = categ_ordered_to_num(A3data['Detail'].values,['low','medium','high'])
Motion_num_A3 = categ_ordered_to_num(A3data['Motion'].values,['none','low','high'])
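
As an alternative sketch, pandas can produce the same ordered codes through `pd.Categorical`. The example below uses made-up values rather than the A3 data. One difference in behavior: a value that is not among the listed categories is coded as -1 instead of raising an error.

```python
import pandas as pd

# Made-up example values; the real columns are A3data['Detail'] and A3data['Motion'].
detail = pd.Series(['low', 'high', 'medium', 'low'])

levels = ['low', 'medium', 'high']
codes = pd.Categorical(detail, categories=levels, ordered=True).codes
print(codes.tolist())  # [0, 2, 1, 0]
```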



In [None]:
# We expect motion and resolution to be important factors. Plot log10(bitrate) against
# log10(primary resolution), coloring the points by motion, detail, frame rate and quality.
Y = np.log10(A3data['PrimaryBitsPerSecond'].values)

fig = plt.figure(figsize=(17,5))
fig.subplots_adjust(hspace=.5)

panels = [
    ('motion', Motion_num_A3),
    ('detail', Detail_num_A3),
    ('frame rate', A3data['ImageRate'].values),
    ('quality', A3data['Quality'].values),
]
for k, (title, color) in enumerate(panels, start=1):
    plt.subplot(2, 2, k)
    plt.scatter(PrimRes_num_A3, Y, c=color, alpha=0.2)
    plt.colorbar()
    if k > 2:  # label the x-axis on the bottom row only
        plt.xlabel('log10(Primary Resolution)')
    plt.ylabel('log10(Primary Bitrate)')
    plt.title(title)


### Sources
- http://www.lcis.com.tw/paper_store/paper_store/DASH-IEEE-multimedia-preprint-201622602748584.pdf
- http://www4.comp.polyu.edu.hk/~oneprobe/doc/im2011-qoe.pdf
- http://www.itc23.com/fileadmin/ITC23_files/papers/a5.pdf
- http://ieeexplore.ieee.org.proxy.lib.sfu.ca/stamp/stamp.jsp?arnumber=7222404
- http://avigilon.com/assets/pdf/WhitePaperCompressionTechnologiesforHD.pdf
- https://www.axis.com/files/whitepaper/wp_h264_31669_en_0803_lo.pdf