Video Encoding with FFMPEG

George Stoyanov edited this page Jul 4, 2018 · 22 revisions

Video Encoding

Many video codecs exist and are in use in the real world. Only the most widely used ones, which also provide a good quality/bandwidth ratio, are described here.

H.264 Encoding

H.264, also known as MPEG-4 Part 10, is by far the most used codec nowadays and is supported by most of today's decoding devices. The syntax of the encoding command using ffmpeg is:

ffmpeg -i <source> \
-c:v libx264 -an -preset ultrafast -crf 20 -pix_fmt yuv422p <output>

where:
-i - defines the path to the input video file
-c:v libx264 - defines the codec used for encoding the video
-an - defines that the output should not have audio. To keep the audio, use -c:a copy to copy it unchanged, or select an audio encoder to re-encode it
-preset - defines the preset. The slower the preset, the better the output video quality, but also the longer the encoding takes. The possible presets are: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow and placebo. The returns are diminishing: slow helps about 5-10% compared to the medium preset, slower helps about 5% compared to slow, veryslow helps about 3% compared to slower, and placebo helps at most ~1% compared to veryslow at the cost of a much higher encoding time.
-crf - defines the constant rate factor. This method allows the encoder to attempt to achieve a certain output quality for the whole file when output file size is of less importance. This provides maximum compression efficiency with a single pass. Each frame gets the bitrate it needs to keep the requested quality level. The downside is that you can't tell it to get a specific filesize or not go over a specific size or bitrate. The range of the quantizer scale is 0-51: where 0 is lossless, 23 is default, and 51 is worst possible. A lower value is a higher quality and a subjectively sane range is 18-28. Consider 18 to be visually lossless or nearly so: it should look the same or nearly the same as the input but it isn't technically lossless. The range is exponential, so increasing the CRF value +6 is roughly half the bitrate while -6 is roughly twice the bitrate. General usage is to choose the highest CRF value that still provides an acceptable quality. If the output looks good, then try a higher value and if it looks bad then choose a lower value.
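-pix_fmt - defines the pixel format of the output. The example uses yuv422p, which is planar YUV 4:2:2 (the trailing p stands for planar, not progressive). Other formats, such as yuv420p, can be used as well. If the option is omitted, ffmpeg picks a format supported by the encoder based on the input.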
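
The rule of thumb for -crf in the list above (CRF +6 is roughly half the bitrate, CRF -6 roughly twice) can be turned into a quick estimate. This is only a minimal sketch: the helper name crf_bitrate_ratio is hypothetical and not part of ffmpeg or x264, and the result is no more accurate than the rule of thumb itself.

```shell
#!/bin/sh
# Hypothetical helper (not an ffmpeg/x264 tool): estimate how the bitrate
# changes when moving from one CRF value to another, using the rule of thumb
# ratio = 2^((old_crf - new_crf) / 6).
crf_bitrate_ratio() {
    awk -v old_crf="$1" -v new_crf="$2" \
        'BEGIN { printf "%.2f\n", exp(log(2) * (old_crf - new_crf) / 6) }'
}

crf_bitrate_ratio 23 29   # CRF raised by 6: prints 0.50
crf_bitrate_ratio 23 17   # CRF lowered by 6: prints 2.00
```

A ratio of 0.50 means the file is expected to be about half the size of the reference encode. Real content can deviate from this noticeably.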

You can encode the video losslessly by setting the constant rate factor to 0 (-crf 0). The fastest way to encode is the ultrafast preset:

ffmpeg -i input -c:v libx264 -preset ultrafast -crf 0 output.mkv

The preset that gives the best compression is veryslow:

ffmpeg -i input -c:v libx264 -preset veryslow -crf 0 output.mkv

Frame Rate conversion

By default, the frame rate of a transcoded video may be reported by mediainfo as variable. To fix that, add -x264opts force-cfr, which forces a constant frame rate.

GoP size

Another useful option is the GoP size, which is set through another x264 option: -x264opts keyint=1. This creates a video with only I-frames. keyint=xxx sets the maximum distance between I-frames. The x264 default key interval is 250.

Interlaced video

When encoding interlaced video, make sure to add the two flags: -flags +ilme+ildct. A standard ffmpeg transcoding line for creating an interlaced PAL CBR TS file with a fixed frame rate is:

ffmpeg -i input.mp4 \
        -fflags genpts \
        -c:v libx264 -r 50 -s 720x576 \
        -flags +ilme+ildct -top 1 \
        -x264opts nal-hrd=cbr:force-cfr:keyint=100:min-keyint=100 \
        -b:v 50M -minrate:v 50M -maxrate:v 50M -muxrate 70M -bufsize:v 130M -pcr_period 35 \
        -c:a aac -ac 2 -b:a 128k \
        -f mpegts output.ts

-fflags genpts - generates missing PTS values when the DTS is present, which recreates the packet timestamps
keyint - defines the GoP (group of pictures) size in frames, i.e. how many P and B frames you have between the I-frames. The higher the keyint, the better the compression at a given bitrate, but the longer a decoder has to wait for the next I-frame before it can start decoding. It is considered good practice to use a GoP of around 2 seconds (2 s * 50 fps = 100 frames), which is a good compromise
min-keyint - sets the minimum distance between IDR frames. With min-keyint equal to keyint, as in the command above, IDR frames are placed at a fixed interval. Usually it is a good idea to let the encoder decide when to insert I-frames, for example on a sudden scene change, but extra I-frames cost bits and can therefore reduce quality at a fixed bitrate. Note that whether a GoP is open or closed is controlled in x264 by the separate open-gop option, not by min-keyint.
ilme flag - forces interlaced motion estimation
ildct flag - uses interlaced DCT (discrete cosine transform)
-top 1 - sets the output to top field first. Interlaced PAL video usually uses top field first. Interlaced SD content on DVD, on the other hand, uses bottom field first.
-pcr_period 35 - sets the PCR interval to 35 ms. The PCR interval should be less than 40 ms, so 35 or an even lower value should be fine.
nal-hrd=cbr - sets the hypothetical reference decoder (HRD) signalling to CBR and pads the bitstream to the specified bitrate
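
The 2-second GoP rule of thumb from the list above is simple arithmetic: keyint = frame rate * GoP duration. The following is a small sketch that derives the value and prints a matching -x264opts fragment. The helper name gop_x264opts is hypothetical, and setting min-keyint equal to keyint simply mirrors the command above.

```shell
#!/bin/sh
# Hypothetical helper: compute keyint from the frame rate (fps) and the
# desired GoP duration (seconds), then print an -x264opts fragment with a
# fixed key-frame interval (min-keyint equal to keyint).
gop_x264opts() {
    fps="$1"
    seconds="$2"
    keyint=$((fps * seconds))
    printf 'keyint=%d:min-keyint=%d\n' "$keyint" "$keyint"
}

gop_x264opts 50 2   # prints keyint=100:min-keyint=100
```

The printed fragment can be combined with other x264 options, for example -x264opts "nal-hrd=cbr:force-cfr:$(gop_x264opts 50 2)".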

Color conversion, multiplexing and image conversion (complex case)

ffmpeg -analyzeduration 100M -probesize 100M \
       -loop 1 -i image.jpg -i left.wav -i right.wav \
       -map 0 -map 1 -map 2 \
       -c:v libx264 -pix_fmt yuv422p10le -coder 0 -t 12 -r 50 \
       -x264opts keyint=1:intra-refresh=1:force-cfr \
       -color_primaries 1 -color_trc 1 -colorspace 1 \
       -c:a copy \
       video.mxf

This command creates a new video.mxf file containing three streams: a static picture as the video, and two mono audio channels, left.wav and right.wav. The settings used here are:
-map 0/1/2 - maps all three input streams into the output file
-coder 0 - disables CABAC, the default entropy coder used by x264, and falls back to CAVLC. Though CABAC is somewhat slower on both the decoding and encoding end, it offers 10-15% better compression on live-action sources and considerably more on animated sources, especially at low bitrates.
-r 50 - sets the frame rate
keyint=1 - creates a video with only I-frames
intra-refresh=1 - enables periodic intra refresh, which x264 documents as a replacement for IDR frames. With keyint=1 every frame is already an I-frame.
force-cfr - forces the frame rate to be constant
-color_primaries 1 -color_trc 1 -colorspace 1 - flags the output as BT.709 (the value 1 means BT.709 for each of the three options)
-pix_fmt yuv422p10le - forces the pixel format to be 4:2:2 with 10 bits per component

CBR Encoding

ffmpeg -i <input> \
-c:v libx264 -x264opts nal-hrd=cbr \
-b:v 30M -minrate:v 30M -maxrate:v 30M -muxrate 35M -bufsize:v 25M \
-c:a aac -ac 2 -b:a 128k \
-f mpegts <output>

This command creates a fully CBR TS from an input file. It is important that the video bitrate (-b:v) equals both the minimum and the maximum video bitrate (-minrate:v and -maxrate:v), that the muxrate is 10-15% higher than the video bitrate plus the audio bitrate, and that the bufsize is around 70% of the video bitrate. The resulting CBR TS contains 10-15% of stuffing.
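
The sizing rules in the paragraph above can be worked through for a concrete bitrate. This is a rough sketch only: the helper name cbr_rates is hypothetical, the 15% muxrate margin and the 70% bufsize are the upper and nominal values quoted above, and all rates are assumed to be given in kbit/s.

```shell
#!/bin/sh
# Hypothetical helper: derive CBR rate options from the video and audio
# bitrates (both in kbit/s), using integer arithmetic:
#   muxrate = (video + audio) * 1.15   (15% margin, assumed)
#   bufsize = video * 0.70
cbr_rates() {
    video_kbps="$1"
    audio_kbps="$2"
    muxrate=$(( (video_kbps + audio_kbps) * 115 / 100 ))
    bufsize=$(( video_kbps * 70 / 100 ))
    printf '%s\n' "-b:v ${video_kbps}k -minrate:v ${video_kbps}k -maxrate:v ${video_kbps}k -muxrate ${muxrate}k -bufsize:v ${bufsize}k"
}

cbr_rates 30000 128
# prints -b:v 30000k -minrate:v 30000k -maxrate:v 30000k -muxrate 34647k -bufsize:v 21000k
```

For 30 Mbps video with 128 kbps audio this gives a muxrate close to the 35M used in the command above. The bufsize comes out lower than the 25M used there, so treat the 70% figure as a starting point rather than a fixed rule.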

If you get a "dts < pcr, TS is invalid" error, you should increase the muxrate.

Note: For more information you can refer to X264 Encoding Guide

H.265 Encoding

H.265 is the successor of H.264 and promises to cut the bandwidth almost in half for the same video quality. Unfortunately this comes with a much slower encoding process. There are currently two encoders available for H.265: x265 and Kvazaar. x265 is by far more popular than Kvazaar.

x265 Encoding

The syntax of the x265 command is exactly the same as the x264 command so you can refer to the above section. Usage:

ffmpeg -i <source> \
-c:v libx265 -an -preset ultrafast -crf 20 -pix_fmt yuv422p <output>

Kvazaar Encoding

Kvazaar is an open-source HEVC encoder licensed under LGPLv2.1. According to the project, Kvazaar is not yet finished and does not implement all the features of HEVC, and compression performance will increase as more coding tools are added. When Kvazaar is used through ffmpeg, the syntax is very similar to x264:

ffmpeg -i <input> \
-c:v libkvazaar -an -kvazaar-params preset=ultrafast,lossless=1 <output>

where:
-c:v libkvazaar - defines the encoding library (ffmpeg has to be built with libkvazaar support)
-an - removes the audio from the output
-kvazaar-params - passes options to Kvazaar as a comma-separated list of name=value pairs
preset=ultrafast - defines the preset
lossless=1 - defines that we want to encode the video losslessly, without losing any data
<output> - the output path is given as the last argument. ffmpeg has no -o option

You can also use the kvazaar encoder stand-alone. Here is an example command:

kvazaar -i <input> \
--input-res <width>x<height> --preset <preset> --gop xxx -n xxx --lossless --no-psnr \
-o <output>

where:
-i - specifies the path to the input file
--input-res <width>x<height> - specifies the resolution of the input file in pixels
--preset - defines the used preset.
--gop xxx - defines the GoP structure of the encoding. 0: disabled or X: B-frame pyramid of length X
-n xxx - specifies the number of frames to code
--lossless - specifies lossless encoding of the content
--no-psnr - disables the PSNR calculation
-o - defines the path to the output file

x265 CBR Encoding

Unfortunately I didn't find a way to create a true CBR TS output here. This is the command that comes closest to a CBR bitrate:

ffmpeg -i <input> \
-c:v libx265 -b:v 25M -muxrate 30M \
-x265-params strict-cbr=1:vbv-bufsize=25000:vbv-maxrate=25000 \
-c:a aac -ac 2 -b:a 128k \
-f mpegts <output.ts>

where the following ffmpeg options and x265 parameters are used:
-muxrate - sets the rate of the multiplex
-b:v - sets the video bitrate
strict-cbr=1 - forces the use of CBR encoding
vbv-bufsize - sets the size of the video buffer in kbit
vbv-maxrate - sets the maximum video bitrate in kbit/s

In this case the output TS file will be 30 Mbps, of which approximately 5 Mbps will be stuffing bits. Please note that all parameters passed after -x265-params must be separated with a colon (:).
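
Because the value of -x265-params is a single colon-separated string, it can be convenient to assemble it from individual name=value pairs. This is a minimal sketch, and the helper name join_x265_params is hypothetical. It only joins strings and does not check that the parameters are valid x265 options.

```shell
#!/bin/sh
# Hypothetical helper: join name=value pairs into one colon-separated
# string suitable for ffmpeg's -x265-params option.
join_x265_params() {
    # Run in a subshell so the changed IFS does not leak to the caller;
    # "$*" joins the arguments with the first character of IFS.
    ( IFS=:; printf '%s\n' "$*" )
}

join_x265_params strict-cbr=1 vbv-bufsize=25000 vbv-maxrate=25000
# prints strict-cbr=1:vbv-bufsize=25000:vbv-maxrate=25000
```

The result can be passed as -x265-params "$(join_x265_params strict-cbr=1 vbv-bufsize=25000 vbv-maxrate=25000)".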

If you get a "dts < pcr, TS is invalid" error, you should increase the muxrate.

VP9 Encoding

VP9 is an open and royalty-free video coding format developed by Google. VP9 is the successor to VP8 and competes with MPEG's High Efficiency Video Coding (HEVC/H.265). Usage:

ffmpeg -i <source> \
-c:v libvpx-vp9 -pass 1 -b:v 10000K -threads 8 -speed 4 -an \
<output.webm>

The command is very similar to x264 encoding. The differences are:
-c:v libvpx-vp9 - defines the encoding library
-pass - selects the pass number (1 or 2) of a two-pass encode. For a single-pass encode the option can be left out.
-b:v - specifies the target bitrate of the encoding. This parameter is completely optional.
-threads - defines the number of threads allocated to the encoding process. Please note that Intel CPUs with Hyper-Threading expose twice as many threads as physical cores. So, for example, if you have a quad-core CPU and want to allocate all resources to the encoding, the number of threads should be 8.
-speed 4 - tells VP9 to encode really fast, sacrificing quality. Useful to speed up the first pass. Speed 1 is a good speed vs. quality compromise: it produces output quality typically very close to speed 0, but usually encodes much faster.
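
Since -speed 4 is described above as useful for a first pass and -speed 1 as a good compromise, a two-pass encode might combine the two. The following is only a sketch under that assumption: the helper name vp9_two_pass_cmds is hypothetical, it prints the two commands instead of running them, and it assumes file names without spaces.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the two ffmpeg commands of a
# two-pass VP9 encode. Pass 1 uses -speed 4 and discards its video output
# through the null muxer; pass 2 uses -speed 1 and writes the real file.
# Assumes input/output names contain no spaces.
vp9_two_pass_cmds() {
    input="$1"
    output="$2"
    bitrate="$3"
    printf '%s\n' "ffmpeg -i $input -c:v libvpx-vp9 -pass 1 -b:v $bitrate -speed 4 -an -f null /dev/null"
    printf '%s\n' "ffmpeg -i $input -c:v libvpx-vp9 -pass 2 -b:v $bitrate -speed 1 -an $output"
}

vp9_two_pass_cmds input.mp4 output.webm 10000K
```

Both passes have to use the same bitrate. Review the printed commands and then run them one after the other.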

Note: The recommended container for files encoded with VP9 is WebM.

FFV1 Encoding

References:

Comparison of Video Codecs and Containers
List of YUV Formats
Comparison of Container Formats
X264 Encoding Guide
X265 Commands
x264 FFmpeg Options Guide
Kvazaar - Open Source HEVC Encoding Library GitHub Project
Kvazaar Official Webpage
VP9 Compression Guide
VP9 Encoder Parameters
VP9 vs. H.264 vs. H.265 Comparison
FFV1 Video Encoding
FFMPEG Presets Github
HW Acceleration