# How ffprobe and ffmpeg have been installed and configured
I have an Apple MacBook, so I ran the command brew install ffmpeg, which installs both ffprobe
and ffmpeg.
The Coursera lab environment already has ffprobe and ffmpeg installed.
# Brief analysis of the application
I start by installing ffmpeg and ffprobe as described in the previous section.
Because Python doesn’t provide a direct interface to ffprobe and ffmpeg, I use the subprocess module from Python's
standard library, which runs command-line applications directly from Python. To make the ffprobe results easier to
handle, I pass -print_format json as an ffprobe argument.
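
To illustrate why JSON output is convenient, here is a minimal sketch that parses a hand-written sample shaped like ffprobe's JSON output. The sample string, its values, and the helper name find_stream are assumptions made for illustration; they are not part of the assignment code.

```python
import json

# Hand-written sample in the shape of ffprobe's JSON output (illustrative values only).
SAMPLE_FFPROBE_OUTPUT = '''
{
  "streams": [
    {"codec_type": "video", "codec_name": "h264", "width": 640, "height": 360},
    {"codec_type": "audio", "codec_name": "aac", "channels": 2}
  ],
  "format": {"format_name": "mov,mp4,m4a,3gp,3g2,mj2"}
}
'''

def find_stream(info, codec_type):
    """Return the first stream of the given type, or None if there is none (hypothetical helper)."""
    streams = info.get('streams', [])
    return next((s for s in streams if s.get('codec_type') == codec_type), None)

info = json.loads(SAMPLE_FFPROBE_OUTPUT)
video = find_stream(info, 'video')
print(video['codec_name'], video['width'], video['height'])
print(find_stream(info, 'subtitle'))  # no subtitle stream in the sample
```

Once the output is a dictionary, each requirement can be checked with ordinary key lookups instead of parsing free-form text.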
I define several functions to analyze and convert video files:
1. get_film_file_info calls ffprobe and returns a video file's metadata as a Python dictionary.
2. validate_film_file calls get_film_file_info and validates the video file's metadata against the requirements.
The result is a list of validation errors.
3. convert_film_file calls ffmpeg and converts the given video file to a new file in the required format.
Finally, I iterate over all the video files, validate them, and convert them if necessary.
# Brief description of the terms
 - Video format (container) - The file format used to store video. Each format has its own file extension (.avi, .mp4)
and describes a layout within the file: how and where the video and audio signals, metadata, and subtitles are
located.
 - Video codec - The algorithm used to encode and decode a video signal. It may be lossless (Huffyuv, FFV1) or lossy
(H.265, VP9).
 - Audio codec - The algorithm used to encode and decode an audio signal. It also may be lossless (FLAC, WavPack)
or lossy (MP3, AAC).
 - Frame rate - The number indicating how many images are shown per second during video playback. Often called
FPS (frames per second).
 - Aspect ratio - The proportion between a video's width and height. It may be expressed as a single number, but two
numbers are used more often, for example, 16:9 and 4:3.
 - Resolution - Two numbers representing the number of pixels per video frame. They are usually presented as the
width and height of a video measured in pixels, for example, 800x600, 1920x1080.
 - Video bit rate - Number of bits representing one second of a video signal. It is typically measured in megabits per
second (Mbps). May vary during the video playback.
 - Audio bit rate - Number of bits representing one second of an audio signal. It is typically measured in kilobits per
second (kbps).
 - Audio channels - Number of separate audio signals representing a single perceived audio signal, for example,
mono (1 channel), stereo (2 channels), and surround sound (6-8 channels).
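
Some of these terms can be made concrete with a little arithmetic. The following is a minimal sketch, assuming frame rates arrive as rational strings such as 30000/1001 and that one megabit means 1,000,000 bits. The function names are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from math import gcd

def parse_frame_rate(rate_str):
    """Convert a rational string such as '30000/1001' to an exact Fraction."""
    numerator, denominator = rate_str.split('/')
    return Fraction(int(numerator), int(denominator))

def reduce_aspect_ratio(width, height):
    """Reduce a resolution to its simplest width:height proportion."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor

def bits_to_megabits(bits_per_second):
    """Convert bits per second to megabits per second (decimal megabits assumed)."""
    return bits_per_second / 1_000_000

print(float(parse_frame_rate('30000/1001')))  # an NTSC-style rate, just under 30 FPS
print(reduce_aspect_ratio(1920, 1080))        # reduces to 16:9
print(reduce_aspect_ratio(720, 404))          # close to 16:9, but not exactly
print(bits_to_megabits(3_000_000))
```

Keeping the frame rate as a Fraction avoids floating-point noise when comparing against a target such as 25 FPS.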

Install ffmpeg and ffprobe on my Mac laptop.

In [2]:
!brew install ffmpeg

To reinstall 6.1.1_3, run:
  brew reinstall ffmpeg


Import required dependencies.

In [3]:
import subprocess
import json
import os

Define a function that uses ffprobe to read metadata from a video file given its path and returns the metadata as a Python dictionary.

In [5]:
def get_film_file_info(file_path):
    ffprobe_result = subprocess.run(
        [
            'ffprobe',               # call ffprobe
            '-v', 'error',           # print only error logs
            '-show_format',          # print format metadata
            '-show_streams',         # print streams metadata
            '-print_format', 'json', # use JSON as an output format
            file_path
        ],
        stdout=subprocess.PIPE,  # capture the JSON output
        stderr=subprocess.PIPE   # keep error logs out of the JSON output
    )
    
    if ffprobe_result.returncode != 0:
        print(f'Error reading info from file {file_path}')
        return None
    
    return json.loads(ffprobe_result.stdout)

Define a function that reads a video file's metadata using the function above and checks whether the file meets the requirements.

In [6]:
def validate_film_file(file_path):
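    file_info = get_film_file_info(file_path)
    if file_info is None:
        # ffprobe failed, so there is no metadata to validate
        return ['Unable to read metadata from the file']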

    validation_errors = []

    file_format = file_info.get('format')
    if file_format is None:
        validation_errors.append('No format metadata presented in the file')
    else:
        format_name = file_format.get('format_name')
        if format_name is None:
            validation_errors.append('No format name available in the file')
        elif 'mp4' not in format_name.split(','):
            validation_errors.append(f'Video format (container): expected "mp4", actual "{format_name}"')
    
    streams = file_info.get('streams')
    if streams is None:
        validation_errors.append('No streams metadata presented in the file')
        return validation_errors
    elif len(streams) < 2:
        validation_errors.append(f'Number of streams: expected "at least 2 (video and audio)", actual "{len(streams)}"')
    
    audio_stream = next((stream for stream in streams if stream.get('codec_type') == 'audio'), None)
    if audio_stream is None:
        validation_errors.append('No audio stream metadata presented in the file')
    else:
        audio_codec_name = audio_stream.get('codec_name')
        if audio_codec_name != 'aac':
            validation_errors.append(f'Audio codec: expected "aac", actual "{audio_codec_name}"')

        audio_channel_layout = audio_stream.get('channel_layout')
        if audio_channel_layout != 'stereo':
            validation_errors.append(f'Audio channels: expected "stereo", actual "{audio_channel_layout}"')

        audio_bit_rate_str = audio_stream.get('bit_rate')
        audio_bit_rate = int(audio_bit_rate_str) if audio_bit_rate_str else None
        if not audio_bit_rate:
            validation_errors.append('Audio bit rate: expected "up to 256 kb/s", actual "not available"')
        elif audio_bit_rate > 256 * 1024 * 8:
            validation_errors.append(f'Audio bit rate: expected "up to 256 kb/s", actual "{audio_bit_rate / 8 / 1024} kb/s"')
        
    video_stream = next((stream for stream in streams if stream.get('codec_type') == 'video'), None)
    if video_stream is None:
        validation_errors.append('No video stream metadata presented in the file')
    else:
        video_codec_name = video_stream.get('codec_name')
        if video_codec_name != 'h264':
            validation_errors.append(f'Video codec: expected "h.264", actual "{video_codec_name}"')
    
        frame_rate_str = video_stream.get('r_frame_rate')
        number_of_frames, duration = map(int, frame_rate_str.split('/'))
        frame_rate = number_of_frames / duration
        if frame_rate != 25:
            validation_errors.append(f'Frame rate: expected "25 FPS", actual "{frame_rate} FPS"')

        width = video_stream.get('width')
        height = video_stream.get('height')
        if width != 640 or height != 360:
            validation_errors.append(f'Resolution: expected "640 x 360", actual "{width} x {height}"')
            
        aspect_ratio_str = video_stream.get('display_aspect_ratio')
        if aspect_ratio_str is not None:
            aspect_width, aspect_height = map(int, aspect_ratio_str.split(':'))
        else:
            aspect_width, aspect_height = width, height 
        if aspect_width * 9 != aspect_height * 16:
            validation_errors.append(f'Aspect ratio: expected "16:9", actual "{aspect_width}:{aspect_height}"')
            
        video_bit_rate_str = video_stream.get('bit_rate')
        video_bit_rate = int(video_bit_rate_str) if video_bit_rate_str else None
        if not video_bit_rate:
            validation_errors.append('Video bit rate: expected "2 – 5 Mb/s", actual "not available"')
        elif not 2 * 1024 * 1024 * 8 <= video_bit_rate <= 5 * 1024 * 1024 * 8:
            validation_errors.append(f'Video bit rate: expected "2 – 5 Mb/s", actual "{video_bit_rate / 8 / 1024 / 1024} Mb/s"')
        
    return validation_errors
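
For the MOV/MP4 family, ffprobe reports format_name as a comma-separated list of the form mov,mp4,m4a,3gp,3g2,mj2, so the membership test in the function above also accepts .mov files. If a stricter check is wanted, one option is to look at the file extension as well. The following is a minimal sketch under that assumption; the helper name looks_like_mp4 is hypothetical, and the extension is only a heuristic, not proof of the container type.

```python
import os

def looks_like_mp4(format_name, file_path):
    """Heuristic check: the format list mentions mp4 and the file is named *.mp4."""
    names = format_name.split(',') if format_name else []
    extension = os.path.splitext(file_path)[1].lower()
    return 'mp4' in names and extension == '.mp4'

mov_family = 'mov,mp4,m4a,3gp,3g2,mj2'
print(looks_like_mp4(mov_family, 'film.mp4'))
print(looks_like_mp4(mov_family, 'film.mov'))
print(looks_like_mp4('avi', 'film.avi'))
```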

Define a function that converts a video file to the required format given a file path.

In [7]:
def convert_film_file(file_path):
    return_code = subprocess.run([
        'ffmpeg',                        # call ffmpeg
        '-y',                            # overwrite the output file if it already exists
        '-v', 'error',                   # print only error log
        '-i', file_path,                 # input file path
        '-c:v', 'libx264',               # desired video codec
        '-c:a', 'aac',                   # desired audio codec
        '-ac', '2',                      # desired audio channels
        '-b:a', '256k',                  # desired audio bit rate
        '-r', '25',                      # desired frame rate
        '-s', '640x360',                 # desired resolution
        '-aspect', '16:9',               # desired aspect ratio
        '-minrate', '2M',                # minimal video bit rate
        '-b:v', '3M',                    # desired video bit rate
        '-maxrate', '5M',                # maximal video bit rate
        '-bufsize', '1M',                # service option for configuring video bit rate
        os.path.splitext(file_path)[0] + '_formatOK.mp4' # output file path (works for any extension length)
    ]).returncode
    
    if return_code != 0:
        print(f'Error converting file {file_path}')
        return False
    
    return True
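
It can help to separate building the ffmpeg command from running it, so the argument list can be inspected or tested without invoking ffmpeg. The following is a minimal sketch that reuses the options from the function above; the helper name build_ffmpeg_args and the example paths are hypothetical.

```python
def build_ffmpeg_args(input_path, output_path):
    """Return the ffmpeg argument list for the required target format."""
    return [
        'ffmpeg', '-y', '-v', 'error',
        '-i', input_path,
        '-c:v', 'libx264',   # H.264 video
        '-c:a', 'aac',       # AAC audio
        '-ac', '2',          # stereo
        '-b:a', '256k',      # audio bit rate
        '-r', '25',          # frame rate
        '-s', '640x360',     # resolution
        '-aspect', '16:9',   # aspect ratio
        '-minrate', '2M', '-b:v', '3M', '-maxrate', '5M', '-bufsize', '1M',
        output_path,
    ]

args = build_ffmpeg_args('input.avi', 'input_formatOK.mp4')
print(' '.join(args))
```

The resulting list can be passed directly to subprocess.run, as in convert_film_file.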

Run the algorithm.

In [8]:
root_path = 'Exercise3_Films'

def is_original_submitted_file(file_name):
    """
    Checks if the file is one of the assignment files but not some other file
    """
    return (not file_name.startswith('.') 
            and not file_name.endswith('_formatOK.mp4') 
            and os.path.isfile(root_path + '/' + file_name))

report_text = ''
for video_file in [file for file in os.listdir(root_path) if is_original_submitted_file(file)]:
    validation_results = validate_film_file(root_path + '/' + video_file)
    if len(validation_results) > 0:
        report_text += video_file + '\n'
        for validation_result in validation_results:
            report_text += '\t' + validation_result + '\n'

        convert_film_file(root_path + '/' + video_file)

with open('report.txt', 'w') as f:
    f.write(report_text)

print(report_text)

Last_man_on_earth_1964.mov
	Audio codec: expected "aac", actual "pcm_s16le"
	Video codec: expected "h.264", actual "prores"
	Frame rate: expected "25 FPS", actual "23.976023976023978 FPS"
	Video bit rate: expected "2 – 5 Mb/s", actual "1.106881022453308 Mb/s"
Voyage_to_the_Planet_of_Prehistoric_Women.mp4
	Audio codec: expected "aac", actual "mp3"
	Video codec: expected "h.264", actual "hevc"
	Frame rate: expected "25 FPS", actual "29.97002997002997 FPS"
	Video bit rate: expected "2 – 5 Mb/s", actual "0.9583064317703247 Mb/s"
The_Gun_and_the_Pulpit.avi
	Video format (container): expected "mp4", actual "avi"
	Audio codec: expected "aac", actual "pcm_s16le"
	Audio channels: expected "stereo", actual "None"
	Video codec: expected "h.264", actual "rawvideo"
	Resolution: expected "640 x 360", actual "720 x 404"
	Aspect ratio: expected "16:9", actual "720:404"
	Video bit rate: expected "2 – 5 Mb/s", actual "10.423526525497437 Mb/s"
Cosmos_War_of_the_Planets.mp4
	Frame rate: expected "25 FPS",