<h1>Deep Encode Group 2 Labeling</h1>

Labeling pipeline:
1. Define the minimum acceptable VMAF score.

2. Obtain the video's meta info and store it in a useful format.

3. Split the video into scenes (the scene files are nominally mp4, but actually y4m).

4. Rename the scene files to y4m.

5. **Compress.**

6. **Calculate VMAF against the original scene.**

**Bold**: repeat until acceptable, subject to the compression strategy.
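
The repeat-until-acceptable loop over steps 5 and 6 can be sketched as a binary search for the lowest bitrate that still reaches the minimum VMAF. This is a minimal sketch, assuming VMAF does not decrease as bitrate increases. The names `lowest_acceptable_bitrate` and `toy_vmaf` are hypothetical, and `toy_vmaf` is a made-up curve standing in for a real encode plus VMAF measurement.

```python
from typing import Callable, Optional, Sequence


def lowest_acceptable_bitrate(
    candidates: Sequence[int],
    measure_vmaf: Callable[[int], float],
    minimum_vmaf: float,
) -> Optional[int]:
    """Return the smallest candidate whose VMAF reaches minimum_vmaf, or None."""
    low, high = 0, len(candidates) - 1
    best = None
    while low <= high:
        mid = (low + high) // 2
        if measure_vmaf(candidates[mid]) >= minimum_vmaf:
            best = candidates[mid]  # acceptable: remember it, try lower bitrates
            high = mid - 1
        else:
            low = mid + 1           # not acceptable: move to higher bitrates
    return best


def toy_vmaf(bitrate: int) -> float:
    """Made-up monotone quality curve; NOT a real VMAF measurement."""
    return 100.0 - 40.0 / bitrate


label = lowest_acceptable_bitrate(range(1, 65), toy_vmaf, 92.0)
```

Because the search window shrinks on every iteration, the loop ends after at most about log2(64) + 1 = 7 measurements for 64 candidates. It returns `None` when no candidate is acceptable.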


In [185]:
import os
import subprocess
import pandas as pd
import numpy as np
import re
import math
import time
#import MediaInfo
import shutil


In [186]:
### PARAMETERS ###

# define minimum acceptable vmaf
minimum_acceptable_vmaf = 92.0

# observed optimal bitrates for vmaf 92.0: 22, 10, 11

# candidate bitrates from 1 to 64 Mbit/s (passed to ffmpeg as -b:v {n}M)
bitrate_candidates = np.arange(1, 65)


### LOCATIONS ###

input_vid_name='blue_sky_1080p25'

abs_path='/Volumes/T7/deep_encode_dataset/DATASET_DEEP_ENCODE_2'

# relative path to input video
#input_vid_loc=f'{abs_path}/{input_vid_name}/{input_vid_name}.y4m'

# path to split uncompressed scenes
orig_scenes_loc=f'{abs_path}/{input_vid_name}/orig_scenes'

# path to encodes (will be deleted after use)
encode_loc=f'{abs_path}/{input_vid_name}/encodes'

# path to extracted labels
label_loc=f'{abs_path}/{input_vid_name}/labels'


### INPUT VIDEO META DATA ###

input_vid_res = '1920x1080'
#input_vid_res = '1280x720'

#input_vid_fps = '50'
#input_vid_fps = '30'
input_vid_fps = '25'


input_vid_pix_fmt = 'yuv420p'
#input_vid_pix_fmt = 'yuv422p'

In [187]:
# HELPER METHODS

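def extract_vmaf(result_string):
    # Extract the score that libvmaf logs as "VMAF score: <number>";
    # the fractional part is optional so that an integer score is matched as well
    match = re.search(r"VMAF score: (\d+(?:\.\d+)?)", result_string)

    # Check if a match is found and get the number
    if match:
        return float(match.group(1))
    else:
        # 101.0 lies outside the valid VMAF range (0-100) and marks a failed measurement
        print("VMAF score not found in the string")
        return 101.0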

def delete_encode(path):
    if os.path.isfile(path):
        # Delete the file
        os.remove(path)
        print(f"The file at {path} has been deleted.")
    else:
        print(f"No file found at {path}.")

In [188]:
#! ffprobe -v error -show_streams {orig_scenes_loc}/{input_vid_name}-Scene-001.y4m
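
Step 2 of the pipeline (meta info in a useful format) could be automated instead of filling in the INPUT VIDEO META DATA parameters by hand. The following is a sketch only: `parse_video_meta` and `probe_video` are hypothetical helpers, and it assumes `ffprobe` is on the PATH and can read the file's header (as for the source .y4m).

```python
import json
import subprocess


def parse_video_meta(ffprobe_json: str) -> dict:
    """Pick resolution, frame rate and pixel format out of ffprobe's JSON output."""
    stream = json.loads(ffprobe_json)["streams"][0]
    # r_frame_rate is reported as a fraction string such as "25/1"
    num, den = stream["r_frame_rate"].split("/")
    return {
        "res": f'{stream["width"]}x{stream["height"]}',
        "fps": int(num) / int(den),
        "pix_fmt": stream["pix_fmt"],
    }


def probe_video(path: str) -> dict:
    """Run ffprobe on the first video stream of `path` and parse the result."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=width,height,r_frame_rate,pix_fmt",
        "-of", "json",
        path,
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_video_meta(out)

# Usage (needs a real file): meta = probe_video(f'{abs_path}/{input_vid_name}/{input_vid_name}.y4m')
```

The returned dictionary maps directly onto `input_vid_res`, `input_vid_fps` and `input_vid_pix_fmt`.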

In [189]:
# split scenes based on content detection
# in this command no additional info about the input video is provided

#! scenedetect -i {input_vid_loc} -o {orig_scenes_loc} detect-content split-video -a "-map 0 -c:v copy -c:a copy -f rawvideo -vcodec rawvideo"

In [190]:
# rename to y4m to correctly reflect the data format

def change_file_suffix(directory):
    for filename in os.listdir(directory):
        if filename.endswith('.mp4'):
            new_filename = os.path.splitext(filename)[0] + '.y4m'
            old_path = os.path.join(directory, filename)
            new_path = os.path.join(directory, new_filename)
            os.rename(old_path, new_path)
            print(f'Renamed {filename} to {new_filename}')

#change_file_suffix(orig_scenes_loc)


<h2>Encoding strategy:</h2>

We use ffmpeg's libx264 at preset ultrafast.

The rate control mode is **two-pass average bitrate (ABR)**, set with -b:v, which makes the encoder target the given average bitrate over the whole scene. At the same time, the encoder can still adapt to the different 'complexities' within the scene. The results are the same as with 'capped CRF' (see: https://slhck.info/video/2017/03/01/rate-control.html). We do not set -maxrate or -bufsize in this mode.

(Note, to avoid confusion: in combination with -crf, -maxrate is the maximum acceptable bitrate for the entire video.
In combination with -b:v, -maxrate is the maximum acceptable deviation from the average bitrate!)
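
As an illustration of this strategy, the two passes can be built as argument lists, which avoids shell quoting issues with paths. This is a sketch, not the code used below: `two_pass_commands` is a hypothetical helper, and the flags mirror the ones used in this notebook (raw video input, libx264, preset ultrafast, -b:v without -maxrate or -bufsize).

```python
import shlex


def two_pass_commands(src, dst, bitrate_mbit, res, fps, pix_fmt):
    """Build the ffmpeg argument lists for a two-pass ABR encode of a raw video scene."""
    # raw video carries no header, so size, frame rate and pixel format are given explicitly
    raw_input = ["-f", "rawvideo", "-s", res, "-r", str(fps),
                 "-pix_fmt", pix_fmt, "-i", src]
    # target average bitrate in Mbit/s; no -maxrate / -bufsize in this mode
    encoder = ["-c:v", "libx264", "-preset", "ultrafast", "-b:v", f"{bitrate_mbit}M"]
    # pass 1 only gathers statistics, so its output is discarded
    first = ["ffmpeg", "-v", "error", *raw_input, *encoder,
             "-pass", "1", "-f", "null", "/dev/null"]
    # pass 2 uses the statistics to distribute the bits and writes the real file
    second = ["ffmpeg", "-v", "error", *raw_input, *encoder, "-pass", "2", dst]
    return first, second


first, second = two_pass_commands(
    "scene.y4m", "6M_scene.mp4", 6, "1920x1080", "25", "yuv420p"
)
print(shlex.join(first))
print(shlex.join(second))
```

Each list can be handed to `subprocess.run(cmd, check=True)` in order. The second pass must run only after the first one has succeeded, because it reads the statistics file the first pass writes.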

~~We aim for an encoding that, at its highest setting, encodes at "-crf 17". This constant rate factor is also recommended (by https://trac.ffmpeg.org/wiki/Encode/H.264) for visually lossless compression at the highest compression rate.~~

~~This constant rate factor (crf), however, will be limited by "-maxrate" which specifies a maximum bitrate. (We will set bufsize to maxrate*2). According to documentation (link above), this setting does not strictly guarantee maxrate as bitrate, but we found it still complying pretty well, while still always aiming for an optimal result.~~


In [191]:
# test encode


def test_encode(directory):

    #init output
    df = pd.DataFrame()
    bitrates = []
    filenames = []

    #create dir for encodes
    os.makedirs(encode_loc)

    #iterate over scenes
    for filename in os.listdir(directory):
        
        scene_label=-1.0

        #new dir per scene
        scene_name=os.path.splitext(filename)[0]
        scene_encode_loc = os.path.join(encode_loc, scene_name)
        os.makedirs(scene_encode_loc)

        #do qp0 encode
        qp0_filename = f'{scene_encode_loc}/qp0_{input_vid_name}.mp4'
        qp0_command = f'ffmpeg -v error -f rawvideo -vcodec rawvideo -s {input_vid_res} -r {input_vid_fps} -pix_fmt {input_vid_pix_fmt} -i {orig_scenes_loc}/{filename} -c:v libx264 -preset ultrafast -qp 0 "{qp0_filename}"'
        qp0_result = subprocess.run(qp0_command, capture_output=True, text=True, shell=True)
        #print('QP0 RESULT: ', qp0_result)

        #init binary search
        low = 0
        high = len(bitrate_candidates) - 1

        # lowest bitrate found so far whose VMAF reaches the minimum (None until one is found)
        last_upper_candidate = None
        last_upper_candidate_vmaf = None

        #BINARY SEARCH OVER BITRATES
        # assumes VMAF does not decrease with increasing bitrate;
        # the window shrinks in every iteration, so the loop always terminates
        while low <= high:

            # get current bitrate
            mid = (high + low) // 2
            current_bitrate=bitrate_candidates[mid]
            print(f'CURRENT BITRATE: {current_bitrate}')

            #do encode
            encoded_filename = f'{scene_encode_loc}/{current_bitrate}M_{input_vid_name}.mp4'
            encode_command = f'ffmpeg  -v error -f rawvideo -vcodec rawvideo -s {input_vid_res} -r {input_vid_fps} -pix_fmt {input_vid_pix_fmt} -i {orig_scenes_loc}/{filename} -c:v libx264 -b:v {current_bitrate}M -preset ultrafast -pass 1 -f null /dev/null &&    \
                                ffmpeg -v error -f rawvideo -vcodec rawvideo -s {input_vid_res} -r {input_vid_fps} -pix_fmt {input_vid_pix_fmt} -i {orig_scenes_loc}/{filename} -c:v libx264 -b:v {current_bitrate}M -preset ultrafast -pass 2 {encoded_filename}'
            encode_result = subprocess.run(encode_command, capture_output=True, text=True, shell=True)
            #print('ENCODE RESULT: ', encode_result)

            # calc vmaf (first input: distorted encode, second input: qp0 reference)
            vmaf_command = f'ffmpeg -i {encoded_filename} -i {qp0_filename} -filter_complex libvmaf -f null -'
            vmaf_result = subprocess.run(vmaf_command, capture_output=True, text=True, shell=True)
            #print('VMAF RESULT: ', vmaf_result)

            vmaf_score=extract_vmaf(str(vmaf_result))
            print('VMAF SCORE: ', vmaf_score)

            if vmaf_score >= minimum_acceptable_vmaf:
                # acceptable: remember this candidate and try lower bitrates
                last_upper_candidate=current_bitrate
                last_upper_candidate_vmaf=vmaf_score
                high = mid - 1
            else:
                # not acceptable: continue with higher bitrates
                low = mid + 1

        if last_upper_candidate is None:
            # no candidate reached the minimum; the last encode tested is the highest bitrate
            print(f'ERROR: CANDIDATE WINDOW NOT FITTING!! DID NOT FIND ENCODE THAT IS ABOVE MINIMUM ACCEPTABLE VMAF. CLOSEST ENCODE FOUND AT BITRATE {current_bitrate} AND VMAF {vmaf_score}')
            scene_label=current_bitrate
        else:
            print(f'CONVERGED: LAST UPPERCANDIDATE IS OPTIMAL. current vmaf_score: {vmaf_score} vs last_upper_candidate_vmaf: {last_upper_candidate_vmaf}')
            print(f'FINAL BITRATE LABEL FOR FILE {filename}: {last_upper_candidate}')
            scene_label=last_upper_candidate

        #delete qp0 encode
        #delete_encode(qp0_filename)

        #delete scene dir
        #os.rmdir(dir_path)
        
        filenames.append(scene_name)
        bitrates.append(scene_label)


    # create label dir (reuse it if labels from an earlier run already exist)
    os.makedirs(label_loc, exist_ok=True)

    # output labels to csv
    df = df.assign(Name=filenames, Bitrate=bitrates)
    df.to_csv(f'{label_loc}/labels_{input_vid_name}_vmaf{minimum_acceptable_vmaf}_candidates{bitrate_candidates[0]}-{bitrate_candidates[-1]}.csv')
    
    # delete encodes
    shutil.rmtree(encode_loc)
            
test_encode(orig_scenes_loc)



CURRENT BITRATE: 32
VMAF SCORE:  96.970692
CURRENT BITRATE: 16
VMAF SCORE:  95.024009
CURRENT BITRATE: 8
VMAF SCORE:  93.056883
CURRENT BITRATE: 4
VMAF SCORE:  89.608019
CURRENT BITRATE: 6
VMAF SCORE:  92.062709
CURRENT BITRATE: 5
VMAF SCORE:  91.150475
CONVERGED: LAST UPPERCANDIDATE IS OPTIMAL. current vmaf_score: 91.150475 vs last_upper_candidate_vmaf: 92.062709
FINAL BITRATE LABEL FOR FILE aspen_1080p-Scene-002.y4m: 6
CURRENT BITRATE: 32
VMAF SCORE:  99.639662
CURRENT BITRATE: 16
VMAF SCORE:  92.625429
CURRENT BITRATE: 8
VMAF SCORE:  79.615515
CURRENT BITRATE: 12
VMAF SCORE:  87.375983
CURRENT BITRATE: 14
VMAF SCORE:  90.207914
CURRENT BITRATE: 15
VMAF SCORE:  91.471999
CONVERGED: LAST UPPERCANDIDATE IS OPTIMAL. current vmaf_score: 91.471999 vs last_upper_candidate_vmaf: 92.625429
FINAL BITRATE LABEL FOR FILE aspen_1080p-Scene-003.y4m: 16
CURRENT BITRATE: 32
VMAF SCORE:  97.160659
CURRENT BITRATE: 16
VMAF SCORE:  86.902637
CURRENT BITRATE: 24
VMAF SCORE:  93.80889
CURRENT BITRATE: 2