# Rate-Distortion Optimization I
*Also check out [Part II](https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1969271421694072/4057322776779238/5612335034456173/latest.html) and [Part III](https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1969271421694072/789288020732031/5612335034456173/latest.html)*

More users are streaming video from mobile devices, and at the same time screen sizes are increasing.

Regardless, users expect high quality video without distortion artifacts or lag.

Consequently, many companies invest heavily in optimizing video compression and streaming.

* [Netflix](https://netflixtechblog.com/dynamic-optimizer-a-perceptual-video-encoding-optimization-framework-e19f1e3a277f)
* [Twitter](https://blog.twitter.com/engineering/en_us/topics/infrastructure/2020/introducing-vmaf-percentiles-for-video-quality-measurements.html)
* [Facebook](https://www.youtube.com/watch?v=hKHtGTRdtjI)
* [YouTube](https://youtube-eng.googleblog.com/2016/05/machine-learning-for-video-transcoding.html)

We can exhaustively encode video over a lattice of spatial resolutions & bitrates, plotting against quality or distortion metrics.

<img src="https://miro.medium.com/max/2228/1*yhcFOZvMb-oq51ADMKSXrQ.png" alt="drawing" width="500"/>

Common choices of quality metric include MOS, PSNR, SSIM, and [VMAF](https://github.com/Netflix/vmaf), a fusion model developed to regress MOS from image features.
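Of these metrics, PSNR is the simplest to compute by hand. As a rough illustration, it is a log-scaled mean squared error against the reference frame. This is a minimal sketch on flat lists of 8-bit pixel values; the function name and the toy frames are made up for this example.

```python
import math

def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio (dB) between two equally sized pixel sequences."""
    if len(reference) != len(distorted):
        raise ValueError("frames must contain the same number of pixels")
    # Mean squared error between reference and distorted pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames have no distortion
    return 10 * math.log10(max_value ** 2 / mse)

# Toy 4-pixel "frames" (hypothetical values, not taken from the sample video).
ref = [52, 55, 61, 59]
enc = [54, 55, 60, 57]
print(round(psnr(ref, enc), 2))
```

Higher values mean the encode is closer to the reference. Unlike VMAF, PSNR does not model how viewers perceive the distortion.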

<img src="https://miro.medium.com/max/2228/1*1Q3Xx7CDywwdVbaLlpnRCg.png" alt="drawing" width="500"/>

Then via convex hull optimization, we can determine the bitrate/resolution combinations which maximize perceived quality for a given video.
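A minimal sketch of the convex hull step, assuming each encode is summarized as a `(bitrate, quality, resolution)` tuple. The helper names are illustrative, and the sample points are rounded from a subset of the measurements collected later in this notebook.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Upper convex hull of (bitrate, quality, label) points, sorted by bitrate."""
    # Keep only the best-quality encode at each bitrate.
    best = {}
    for p in points:
        if p[0] not in best or p[1] > best[p[0]][1]:
            best[p[0]] = p
    hull = []
    for p in sorted(best.values(), key=lambda q: q[0]):
        # Drop earlier points that fall on or below the line to the new point.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    # Drop trailing points where spending more bits no longer improves quality.
    while len(hull) >= 2 and hull[-1][1] <= hull[-2][1]:
        hull.pop()
    return hull

points = [
    (100, 22.54, "640:480"), (200, 32.44, "640:480"), (400, 40.32, "640:480"),
    (200, 30.10, "1280:720"), (400, 42.06, "1280:720"),
    (800, 51.34, "1280:720"), (1200, 55.03, "1280:720"),
    (500, 41.04, "1920:1080"), (1000, 52.84, "1920:1080"),
    (1500, 57.26, "1920:1080"), (2500, 61.49, "1920:1080"),
]
hull = upper_hull(points)
for bitrate, quality, resolution in hull:
    print(bitrate, quality, resolution)
```

Points that survive on the hull are the candidate rungs of a bitrate ladder. Every other encode is matched or beaten by some combination of hull points.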

A video whose content varies quickly in space and time can lose more perceived quality at low bitrates than a static one.

<img src="https://miro.medium.com/max/3150/0*8WmTeqaDW5tB7jGZ" width="535"/>

Using scene detection, video can be segmented into GOPs (groups of pictures) with relatively homogeneous image content, so that the bitrate ladder can be optimized per segment.

<img src="https://miro.medium.com/max/1474/0*JOxSte08VHgwYWBP." width="535">
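To make the segmentation idea concrete, here is a deliberately simple sketch that marks a scene cut wherever the mean absolute pixel difference between consecutive frames exceeds a threshold. The function name, the threshold, and the toy frames are assumptions for illustration; production scene detectors are more elaborate.

```python
def scene_cuts(frames, threshold=30.0):
    """Return the indices of frames that start a new scene."""
    cuts = [0]  # the first frame always starts a scene
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        # Mean absolute difference between consecutive frames.
        mad = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if mad > threshold:
            cuts.append(i)
    return cuts

# Five toy 4-pixel frames (hypothetical values) forming three "scenes".
frames = [
    [10, 12, 11, 10],
    [11, 12, 10, 10],
    [200, 198, 205, 199],
    [201, 199, 204, 200],
    [90, 95, 92, 91],
]
cuts = scene_cuts(frames)
print(cuts)
```

Each pair of consecutive cut indices delimits a segment that could then be encoded and scored on its own.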

However, encoding video segments is computationally demanding. 

Recent work on this problem aims to reduce computations by leveraging content info.

Here, we show how to combine MapReduce and FFmpeg to optimize video encoding. 

To run on a Databricks cluster, attach the following init script.

On a community cluster, run the commands using the Terminal via the cluster's apps tab.

```bash
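#!/bin/bash
sudo apt-get update

# Download a static FFmpeg build and unpack it into /home/ubuntu/ffmpeg
cd /home/ubuntu/
mkdir -p ffmpeg
wget https://johnvansickle.com/ffmpeg/builds/ffmpeg-git-amd64-static.tar.xz
tar xvf ffmpeg-git-amd64-static.tar.xz -C ffmpeg --strip-components 1
# Copy the model directory shipped with the build to /usr/local/share
sudo cp -r /home/ubuntu/ffmpeg/model /usr/local/share/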
```

In [0]:
import subprocess
import numpy as np
from scipy.optimize import curve_fit

import pyspark.sql.types as T
import pyspark.sql.functions as F

Here, we have a sample video with various resolutions and bitrates to evaluate.

In [0]:
bitrates = spark.createDataFrame([
    ("2560:1080", [1500, 2000, 2500]), 
    ("1920:1080", [500, 1000, 1500, 2000, 2500]), 
    ("1280:720", [200, 400, 600, 800, 1000, 1200]), 
    ("640:480", [100, 200, 300, 400]), 
    ("480:360", [100, 200, 300, 400])
  ],
  ["resolution", "bitrate"],
)

df = spark.createDataFrame(
  [
    ["https://smellslike.ml/img/pexels_example_4k.mp4"],
  ],
  ["video"],
)

df = df.join(bitrates).withColumn("bitrate", F.explode("bitrate"))
display(df)

video,resolution,bitrate
https://smellslike.ml/img/pexels_example_4k.mp4,2560:1080,1500
https://smellslike.ml/img/pexels_example_4k.mp4,2560:1080,2000
https://smellslike.ml/img/pexels_example_4k.mp4,2560:1080,2500
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,500
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,1000
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,1500
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,2000
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,2500
https://smellslike.ml/img/pexels_example_4k.mp4,1280:720,200
https://smellslike.ml/img/pexels_example_4k.mp4,1280:720,400


We calculate the VMAF score for every configuration with a custom UDF.

In [0]:
import re

@F.udf(returnType=T.DoubleType())
def rate_distortion(video, bitrate, resolution):
    width, height = map(int, resolution.split(":"))
    # First ffmpeg: scale, encode with x264 capped at `bitrate` (kbit/s) via VBV, and pipe the raw bitstream out.
    encode = '/home/ubuntu/ffmpeg/ffmpeg -i {} -vf scale={}:{} -c:v libx264 -tune psnr -x264-params vbv-maxrate={}:vbv-bufsize={} -f rawvideo pipe:'.format(video, width, height, bitrate, bitrate)
    # Second ffmpeg: decode the piped stream, scale it and the source to 1080p, and score with libvmaf.
    score = '/home/ubuntu/ffmpeg/ffmpeg -i pipe: -i {} -filter_complex "[0:v]scale=1920x1080:flags=bicubic[main]; [1:v]scale=1920x1080:flags=bicubic,format=pix_fmts=yuv420p,fps=fps=30/1[ref]; [main][ref]libvmaf=psnr=true:log_path=vmaflog.json:log_fmt=json" -f null -'.format(video)
    ps = subprocess.Popen(encode + " | " + score, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = ps.communicate()[0].decode("utf-8", errors="replace")
    # The log contains a line such as "VMAF score: 60.431787"; take the last match, or null if none.
    scores = re.findall(r"VMAF score: ([0-9.]+)", output)
    return float(scores[-1]) if scores else None
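
Pulling the score out of FFmpeg's combined output is the fragile part of this UDF. Below is a minimal sketch of that parsing step in isolation, assuming the log contains a line of the form `VMAF score: <number>`, which is the format the UDF relies on. The helper name and the sample log text are made up for illustration.

```python
import re

def parse_vmaf(log_text):
    """Return the last VMAF score found in the log text, or None if absent."""
    matches = re.findall(r"VMAF score: ([0-9]+(?:\.[0-9]+)?)", log_text)
    if not matches:
        return None  # encode or scoring failed; no score was logged
    return float(matches[-1])

# Illustrative log excerpt (not real FFmpeg output).
sample_log = "frame=  300 fps= 25\n[libvmaf] VMAF score: 60.431787\n"
print(parse_vmaf(sample_log))
print(parse_vmaf("Conversion failed!"))
```

Returning `None` instead of raising lets a failed encode show up as a null in the DataFrame rather than failing the whole Spark job.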

In [0]:
df = df.repartition(df.count())
df = df.withColumn("vmaf", rate_distortion(F.col("video"), 
                                           F.col("bitrate"), 
                                           F.col("resolution")))
display(df)

video,resolution,bitrate,vmaf
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,2500,60.431787
https://smellslike.ml/img/pexels_example_4k.mp4,2560:1080,2500,60.397404
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,2000,59.100511
https://smellslike.ml/img/pexels_example_4k.mp4,2560:1080,1500,56.687234
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,1500,57.225517
https://smellslike.ml/img/pexels_example_4k.mp4,2560:1080,2000,59.560035
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,500,40.822493
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,1000,52.699253
https://smellslike.ml/img/pexels_example_4k.mp4,640:480,300,37.523325
https://smellslike.ml/img/pexels_example_4k.mp4,1280:720,1000,53.49862


In [0]:
df = (
  df.orderBy("bitrate").groupBy("video", "resolution")
    .agg(F.collect_list("bitrate").alias("bitrate"), 
         F.collect_list("vmaf").alias("vmaf")
        )
)
display(df)

video,resolution,bitrate,vmaf
https://smellslike.ml/img/pexels_example_4k.mp4,1280:720,"List(200, 400, 600, 800, 1000, 1200)","List(30.097029, 42.064831, 47.803116, 51.341108, 53.713008, 55.027662)"
https://smellslike.ml/img/pexels_example_4k.mp4,640:480,"List(100, 200, 300, 400)","List(22.540848, 32.437474, 37.627749, 40.318408)"
https://smellslike.ml/img/pexels_example_4k.mp4,2560:1080,"List(1500, 2000, 2500)","List(56.675947, 59.4976, 61.316153)"
https://smellslike.ml/img/pexels_example_4k.mp4,480:360,"List(100, 200, 300, 400)","List(21.635275, 29.393925, 31.897668, 32.782104)"
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,"List(500, 1000, 1500, 2000, 2500)","List(41.037517, 52.842782, 57.262829, 59.933369, 61.488298)"


Generally, the rate-distortion curves grow logarithmically. Using `scipy.optimize.curve_fit` to regress parameters for such a model, we extrapolate VMAFs across the full range of bitrates.

In [0]:
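@F.udf(returnType=T.ArrayType(T.FloatType()))
def interpolate_vmaf(bitrates, vmafs, low_br=100, high_br=3000, step=50):
  # Three-parameter logarithmic model of the rate-distortion curve.
  def log_fit(x, a, b, c):
    return a * np.log(x + b) + c
  popt, pcov = curve_fit(log_fit, bitrates, vmafs, maxfev=5000)
  # Evaluate the fitted curve every `step` kbps from low_br up to (not including) high_br.
  # Bitrates where x + b < 0 evaluate to NaN.
  xnew = np.arange(low_br, high_br, step)
  return log_fit(xnew, *popt).tolist()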
  
df = df.orderBy("resolution").withColumn("estimated_vmaf", interpolate_vmaf(F.col("bitrate"), F.col("vmaf")))
display(df)

video,resolution,bitrate,vmaf,vmaf_interp,estimated_vmaf
https://smellslike.ml/img/pexels_example_4k.mp4,1280:720,"List(200, 400, 600, 800, 1000, 1200)","List(30.043221, 41.862098, 47.780471, 51.467187, 53.69398, 55.066659)","List(NaN, 20.528948, 29.99733, 34.738102, 37.92654, 40.332237, 42.264744, 43.87996, 45.267525, 46.483757, 47.566353, 48.541798, 49.4294, 50.243687, 50.99585, 51.694695, 52.34729, 52.959377, 53.53569, 54.08018, 54.596184, 55.08653, 55.55365, 55.999645, 56.426346, 56.83535, 57.22807, 57.60575, 57.9695, 58.320312, 58.659077, 58.98659, 59.303577, 59.610695, 59.90854, 60.197655, 60.478535, 60.751637, 61.017384, 61.276154, 61.528313, 61.774185, 62.014076, 62.248272, 62.477036, 62.700615, 62.91924, 63.13312, 63.342464, 63.54746, 63.748276, 63.945087, 64.138054, 64.32731, 64.51301, 64.695274, 64.87423, 65.05)","List(NaN, 20.528948, 29.99733, 34.738102, 37.92654, 40.332237, 42.264744, 43.87996, 45.267525, 46.483757, 47.566353, 48.541798, 49.4294, 50.243687, 50.99585, 51.694695, 52.34729, 52.959377, 53.53569, 54.08018, 54.596184, 55.08653, 55.55365, 55.999645, 56.426346, 56.83535, 57.22807, 57.60575, 57.9695, 58.320312, 58.659077, 58.98659, 59.303577, 59.610695, 59.90854, 60.197655, 60.478535, 60.751637, 61.017384, 61.276154, 61.528313, 61.774185, 62.014076, 62.248272, 62.477036, 62.700615, 62.91924, 63.13312, 63.342464, 63.54746, 63.748276, 63.945087, 64.138054, 64.32731, 64.51301, 64.695274, 64.87423, 65.05)"
https://smellslike.ml/img/pexels_example_4k.mp4,1920:1080,"List(500, 1000, 1500, 2000, 2500)","List(40.72994, 52.667871, 57.50965, 59.424357, 61.518411)","List(NaN, NaN, NaN, NaN, NaN, NaN, 20.329319, 36.259724, 40.720966, 43.417347, 45.35577, 46.87017, 48.11316, 49.167397, 50.08272, 50.891514, 51.616005, 52.272125, 52.871674, 53.423634, 53.93501, 54.41136, 54.85718, 55.27615, 55.671314, 56.04524, 56.40009, 56.737717, 57.05972, 57.367477, 57.662197, 57.94494, 58.216637, 58.478123, 58.73014, 58.973347, 59.20834, 59.435654, 59.655773, 59.869144, 60.076168, 60.27721, 60.472603, 60.662663, 60.847664, 61.027878, 61.20354, 61.374874, 61.542095, 61.705387, 61.864937, 62.020912, 62.173462, 62.322742, 62.468887, 62.612026, 62.752277, 62.88976)","List(NaN, NaN, NaN, NaN, NaN, NaN, 20.329319, 36.259724, 40.720966, 43.417347, 45.35577, 46.87017, 48.11316, 49.167397, 50.08272, 50.891514, 51.616005, 52.272125, 52.871674, 53.423634, 53.93501, 54.41136, 54.85718, 55.27615, 55.671314, 56.04524, 56.40009, 56.737717, 57.05972, 57.367477, 57.662197, 57.94494, 58.216637, 58.478123, 58.73014, 58.973347, 59.20834, 59.435654, 59.655773, 59.869144, 60.076168, 60.27721, 60.472603, 60.662663, 60.847664, 61.027878, 61.20354, 61.374874, 61.542095, 61.705387, 61.864937, 62.020912, 62.173462, 62.322742, 62.468887, 62.612026, 62.752277, 62.88976)"
https://smellslike.ml/img/pexels_example_4k.mp4,2560:1080,"List(1500, 2000, 2500)","List(56.55828, 59.525395, 61.293999)","List(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 43.913292, 48.167397, 50.308235, 51.750267, 52.83912, 53.71419, 54.445812, 55.07445, 55.62556, 56.11618, 56.55828, 56.9606, 57.32972, 57.670696, 57.98752, 58.283386, 58.560898, 58.8222, 59.06908, 59.30305, 59.525394, 59.73721, 59.939453, 60.132946, 60.31842, 60.49651, 60.667786, 60.832745, 60.991837, 61.14547, 61.294, 61.43776, 61.57704, 61.71212, 61.843243, 61.97063, 62.09449, 62.215015, 62.33238, 62.446743)","List(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 43.913292, 48.167397, 50.308235, 51.750267, 52.83912, 53.71419, 54.445812, 55.07445, 55.62556, 56.11618, 56.55828, 56.9606, 57.32972, 57.670696, 57.98752, 58.283386, 58.560898, 58.8222, 59.06908, 59.30305, 59.525394, 59.73721, 59.939453, 60.132946, 60.31842, 60.49651, 60.667786, 60.832745, 60.991837, 61.14547, 61.294, 61.43776, 61.57704, 61.71212, 61.843243, 61.97063, 62.09449, 62.215015, 62.33238, 62.446743)"
https://smellslike.ml/img/pexels_example_4k.mp4,480:360,"List(100, 200, 300, 400)","List(21.707866, 29.411947, 32.032339, 32.949129)","List(21.705008, 27.452772, 29.524036, 30.81696, 31.759262, 32.5011, 33.112976, 33.633713, 34.086975, 34.48826, 34.84827, 35.17471, 35.47331, 35.748447, 36.003544, 36.241318, 36.463978, 36.67333, 36.870872, 37.05787, 37.23539, 37.404346, 37.565533, 37.719624, 37.867226, 38.008858, 38.144985, 38.276024, 38.402336, 38.524254, 38.64207, 38.756054, 38.866444, 38.973465, 39.077312, 39.17817, 39.276203, 39.37157, 39.46441, 39.55485, 39.643017, 39.72902, 39.81296, 39.89494, 39.97504, 40.053352, 40.129955, 40.204914, 40.278305, 40.350193, 40.42063, 40.48968, 40.5574, 40.623833, 40.68903, 40.753033, 40.81589, 40.87764)","List(21.705008, 27.452772, 29.524036, 30.81696, 31.759262, 32.5011, 33.112976, 33.633713, 34.086975, 34.48826, 34.84827, 35.17471, 35.47331, 35.748447, 36.003544, 36.241318, 36.463978, 36.67333, 36.870872, 37.05787, 37.23539, 37.404346, 37.565533, 37.719624, 37.867226, 38.008858, 38.144985, 38.276024, 38.402336, 38.524254, 38.64207, 38.756054, 38.866444, 38.973465, 39.077312, 39.17817, 39.276203, 39.37157, 39.46441, 39.55485, 39.643017, 39.72902, 39.81296, 39.89494, 39.97504, 40.053352, 40.129955, 40.204914, 40.278305, 40.350193, 40.42063, 40.48968, 40.5574, 40.623833, 40.68903, 40.753033, 40.81589, 40.87764)"
https://smellslike.ml/img/pexels_example_4k.mp4,640:480,"List(100, 200, 300, 400)","List(22.733081, 32.336134, 37.58991, 40.416253)","List(22.718735, 28.745735, 32.477406, 35.18753, 37.31686, 39.070923, 40.562378, 41.85972, 43.0077, 44.037193, 44.97038, 45.823753, 46.609894, 47.338623, 48.017765, 48.653637, 49.251423, 49.81543, 50.34927, 50.85601, 51.33827, 51.7983, 52.238068, 52.65928, 53.063435, 53.451866, 53.82575, 54.186134, 54.533966, 54.870083, 55.19525, 55.510166, 55.81545, 56.11167, 56.39936, 56.67899, 56.951, 57.215797, 57.47375, 57.72521, 57.970486, 58.209885, 58.443676, 58.672115, 58.895447, 59.11389, 59.32766, 59.536945, 59.741936, 59.9428, 60.1397, 60.332798, 60.52223, 60.708134, 60.890636, 61.069866, 61.245934, 61.418953)","List(22.718735, 28.745735, 32.477406, 35.18753, 37.31686, 39.070923, 40.562378, 41.85972, 43.0077, 44.037193, 44.97038, 45.823753, 46.609894, 47.338623, 48.017765, 48.653637, 49.251423, 49.81543, 50.34927, 50.85601, 51.33827, 51.7983, 52.238068, 52.65928, 53.063435, 53.451866, 53.82575, 54.186134, 54.533966, 54.870083, 55.19525, 55.510166, 55.81545, 56.11167, 56.39936, 56.67899, 56.951, 57.215797, 57.47375, 57.72521, 57.970486, 58.209885, 58.443676, 58.672115, 58.895447, 59.11389, 59.32766, 59.536945, 59.741936, 59.9428, 60.1397, 60.332798, 60.52223, 60.708134, 60.890636, 61.069866, 61.245934, 61.418953)"
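
To show the fitting idea without SciPy, here is a simplified sketch that fits the two-parameter model `vmaf = a * ln(bitrate) + c` by ordinary least squares. It drops the shift parameter `b` used in the UDF, so it is not the same model, and the helper names are illustrative. The sample values are the 1920:1080 measurements from the grouped table earlier in this notebook, rounded.

```python
import math

def fit_log_curve(bitrates, vmafs):
    """Least-squares fit of vmaf = a * ln(bitrate) + c; returns (a, c)."""
    xs = [math.log(b) for b in bitrates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(vmafs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, vmafs))
    a = sxy / sxx
    c = mean_y - a * mean_x
    return a, c

def predict(bitrate, a, c):
    """Evaluate the fitted curve at a single bitrate."""
    return a * math.log(bitrate) + c

bitrates = [500, 1000, 1500, 2000, 2500]
vmafs = [41.04, 52.84, 57.26, 59.93, 61.49]
a, c = fit_log_curve(bitrates, vmafs)
print(round(a, 2), round(c, 2))
print(round(predict(3000, a, c), 2))  # extrapolated beyond the measured range
```

Because this variant has no shift term, it is defined for every positive bitrate and never produces the NaN entries seen in the table above.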


With an argmax over VMAF arrays, we can index the perceptually optimal resolution for a given bitrate.

In [0]:
d = df.toPandas()

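# Collect the per-resolution curves for each video (rows keep the orderBy("resolution") order above).
d = d[["video", "estimated_vmaf"]].groupby("video", as_index=False).agg(list)
# Replace NaN with 0, then take the argmax across resolutions at each sampled bitrate.
d["optimal_encoding"] = d["estimated_vmaf"].map(lambda x: np.argmax(np.nan_to_num(np.array(x)), axis=0).tolist())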

print(d.optimal_encoding.values)

For each sampled bitrate, `optimal_encoding` holds the index of the resolution (in the row order of `df`) that maximizes perceived quality according to VMAF.
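
Since the argmax yields positions rather than names, it can help to map them back to resolution labels. The following is a minimal pure-Python sketch of the same selection. The function name is hypothetical, and the values are the first five entries of three `estimated_vmaf` rows from the table above (rounded), which correspond to bitrates 100 through 300 on the 50 kbps grid.

```python
import math

def best_resolution_per_bitrate(curves, bitrates):
    """Pick the resolution with the highest estimated VMAF at each bitrate."""
    labels = list(curves)
    result = []
    for i, bitrate in enumerate(bitrates):
        # Treat NaN as 0, mirroring np.nan_to_num in the cell above.
        scores = [0.0 if math.isnan(curves[r][i]) else curves[r][i] for r in labels]
        best = max(range(len(labels)), key=scores.__getitem__)
        result.append((bitrate, labels[best]))
    return result

bitrates = [100, 150, 200, 250, 300]
curves = {
    "1280:720": [math.nan, 20.53, 30.00, 34.74, 37.93],
    "480:360": [21.71, 27.45, 29.52, 30.82, 31.76],
    "640:480": [22.72, 28.75, 32.48, 35.19, 37.32],
}
ladder = best_resolution_per_bitrate(curves, bitrates)
for bitrate, resolution in ladder:
    print(bitrate, resolution)
```

In this excerpt the preferred resolution switches from 640:480 to 1280:720 at 300 kbps, which is the kind of crossover the bitrate ladder is meant to capture.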