Open source tools for a digital cinema pipeline
Apr 30 2013: Quite some time has passed since this page was written. Take with a grain of salt.
The following are basic building blocks for an ad-hoc Digital Cinema pipeline based on open source tools. Keep in mind that none of this is overly practical for production, though:
- There is a lot more to color and the required color science than the simplified examples below might suggest. They are merely a starting point into the fascinating field of color math. Considering the variety of DSM source colorspaces (and their associated gamma encodings) it becomes obvious that production pipelines will be confronted with delicate and crucial color issues. See OpenColorIO for an interesting toolset related to that.
- Another key problem is the complexity of JPEG2000 encoding. Expressed in computing cycles it is a very expensive process, and only parts of it can be parallelized (which is where massive numbers of processing units, as on GPUs, could help).
- FPGA/ASIC solutions (not easily affordable) achieve “realtime” encoding by casting the process in iron. See intoPIX for an example.
- There is CUDA (on off-the-shelf GPUs, quite affordable) and the like. See CUJ2K for an example; alas, as of October 2011, CUJ2K v1.1 does not produce DCI compliant Profile-3 and Profile-4 codestreams.
- Last but not least there are efforts to improve the situation by massive multi-threading: encoding 1 frame still takes significant time, but encoding n frames at once on n CPU cores offsets the cost. See Terrence Meiczinger’s OpenDCP and Carl Hetherington’s DCP-o-matic, which will spread out encoding over a range of network nodes if asked.
- Efficiency: Setting up a shell script to walk all those steps is as easy as it is inefficient. Every image file would be touched a number of times. Every step would involve disk access to and fro. Some of the tools are not multi-threaded. Etc.pp.
Still, it’s a perfectly viable process for tests and short features. And, most of all, it’s a transparent process.
- Prepare the source material (Create an image sequence, wrap audio in Broadcast WAV container)
- Linearize (s)RGB
- Transform color from source color to XYZ (see this nice visual CIE XYZ primer)
- Chromatic adaptation to the targeted white point (need to understand the problem better, see White Gamut)
- Encode with SMPTE spec’d gamma (2.6)
- Compress to JPEG2000 codestream files (The JPEG 2000 Profiles 3 and 4, used here, are lossy, btw)
- Collect assets (images, audio) in MXF containers/streams
- Create the basic infrastructure of a DCP (composition playlist, packing list, assetmap, volindex)
- How to create an image sequence from a video file: FFmpeg and MPlayer/Mencoder are Swiss Army knives for pretty much anything you want to do with video and image sequences. Here FFmpeg is used to create an image sequence from a given video file.
ffmpeg -i video.m2v -f image2 -vcodec tiff %06d.tiff
%06d tells ffmpeg to create 6-digit zero-padded filenames.
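The pattern follows printf conventions, so its effect can be previewed outside of ffmpeg. The following is a minimal sketch in Python; `frame_name` and `max_runtime_hours` are hypothetical helper names, and 24 fps is an assumption.

```python
# printf-style pattern, the same one passed to ffmpeg above
PATTERN = "%06d.tiff"


def frame_name(index: int) -> str:
    """Return the file name the pattern yields for a given frame index."""
    return PATTERN % index


def max_runtime_hours(digits: int = 6, fps: int = 24) -> float:
    """Longest runtime (in hours) whose frame indices still fit in `digits` digits.

    Assumes a constant frame rate (24 fps by default).
    """
    max_index = 10 ** digits - 1
    return max_index / fps / 3600


if __name__ == "__main__":
    print(frame_name(1))        # first frame, zero-padded
    print(frame_name(144000))   # last frame of a 100-minute feature at 24 fps
    print(round(max_runtime_hours(), 1))
```

Six digits leave plenty of headroom for feature-length material, and zero-padding keeps the files in frame order when sorted by name.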
- How to convert sRGB (Gamma ~2.2, see this note on sRGB gamma) to X’Y’Z’ (Gamma 2.6): ImageMagick’s convert lets you perform color transforms, handle embedded color profiles, adjust bit-depth, scale, gamma de/encode etc.
For example (assuming sRGB source):
for f in *tiff; do echo "$f"; convert "$f" -alpha off -depth 12 -gamma 0.454545 -resize 1998x1080 -recolor "0.4124564 0.3575761 0.1804375 0.2126729 0.7151522 0.0721750 0.0193339 0.1191920 0.9503041" -evaluate multiply 0.9166 -gamma 2.6 "xyz-2.6-$f"; done
This takes all tiffs in the current directory, drops the alpha channel, scales bit-depth to the SMPTE spec’d 12 bits per component, decodes gamma (using the common 2.2 approximation of the sRGB curve), scales to 2K (keeping the original aspect ratio), applies an sRGB to XYZ color transform, corrects for DCinema Luminance level (Edit 2015.01.31: See mboufleur’s investigation) and finally encodes to the SMPTE spec’d gamma of 2.6. (You want to transform color with linear values, hence the gamma decode first, then the color transform and then the concluding gamma encode.)
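To make the order of operations concrete, here is a per-pixel sketch of the same math in Python. It is an illustration, not a reimplementation of ImageMagick: the function name `srgb_to_dci_xyz` is hypothetical, the clamp to 0..1 is an assumption, and resizing and alpha handling are left out. The matrix, the 0.9166 factor and both gamma values are taken from the command above.

```python
# sRGB (D65) to CIE XYZ matrix, same coefficients as in the convert call
SRGB_TO_XYZ = (
    (0.4124564, 0.3575761, 0.1804375),
    (0.2126729, 0.7151522, 0.0721750),
    (0.0193339, 0.1191920, 0.9503041),
)
LUMINANCE_SCALE = 0.9166  # "-evaluate multiply 0.9166" in the convert call
SOURCE_GAMMA = 2.2        # approximation of the sRGB curve
DCI_GAMMA = 2.6           # SMPTE spec'd encoding gamma
MAX_CODE = 4095           # largest 12-bit code value


def srgb_to_dci_xyz(r, g, b):
    """Map normalized (0..1) gamma-encoded RGB to 12-bit X'Y'Z' code values."""
    # 1. Gamma decode: the color transform needs linear values
    linear = [c ** SOURCE_GAMMA for c in (r, g, b)]
    # 2. Color transform: 3x3 matrix times the linear RGB vector
    xyz = [sum(m * c for m, c in zip(row, linear)) for row in SRGB_TO_XYZ]
    codes = []
    for value in xyz:
        # 3. Luminance scaling, clamped to the valid range (assumption)
        value = min(max(value * LUMINANCE_SCALE, 0.0), 1.0)
        # 4. Gamma encode and quantize to 12 bits
        codes.append(round(value ** (1 / DCI_GAMMA) * MAX_CODE))
    return tuple(codes)


if __name__ == "__main__":
    print(srgb_to_dci_xyz(1.0, 1.0, 1.0))  # full-scale white
    print(srgb_to_dci_xyz(0.0, 0.0, 0.0))  # black
```

For full-scale white the Y' component comes out at code value 3960, which shows what the 0.9166 factor does: peak white lands below the 12-bit maximum of 4095.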
The convert above merely demonstrates a number of operations. It would have to be tweaked to meet any real-world requirements:
- With ImageMagick the order of operators and settings matters. If your source material has 16-bit color channels you wouldn’t want to scale to 12 bits before gamma decoding or scaling or the color transform but afterwards.
- sRGB’s gamma is not exactly 2.2. A correct linearization of sRGB would be
convert srgb-gradient.tiff -fx "p <= 0.04045 ? p / 12.92 : ((p + 0.055) / 1.055) ^ 2.4" linear.tiff
There’s a linear part for values below 0.04045 and a power function for higher values. The difference is subtle but real (Easy to see with gradients and even easier with a posterize action to, like, 16 levels). Because color transforms have to be performed on linear values any errors from linearization will propagate into color errors on the silver screen.
See Wikipedia for the math and an explanation for why 0.04045 is used instead of 0.03928. And, to close the circle, here is a correct sRGB companding function in ImageMagick’s -fx notation:
p <= 0.0031308 ? p * 12.92 : 1.055 * p ^ ( 1 / 2.4 ) - 0.055
- Jussi Siponen mentions an issue with DCI-compliant image dimensions. He suggests padding (within a convert statement) for non-compliant dimensions, like
-background black -gravity center -extent 1998x1080
for flat 2K and
-background black -gravity center -extent 2048x858
for scope 2K. It’d be interesting to learn whether some cinema servers actually choke on non-compliant image dimensions. The XDC G3 doesn’t.
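Returning to the sRGB linearization note above: the two piecewise functions can be checked numerically. The sketch below is plain Python using the same constants as the -fx expressions. The function names are hypothetical, and `max_gamma22_error` is just one illustrative way to quantify the gap to a plain 2.2 power law.

```python
def srgb_to_linear(v: float) -> float:
    """Decode one normalized (0..1) sRGB component to linear light."""
    if v <= 0.04045:
        return v / 12.92                      # linear toe segment
    return ((v + 0.055) / 1.055) ** 2.4       # power segment


def linear_to_srgb(v: float) -> float:
    """Encode one normalized (0..1) linear component with the sRGB curve."""
    if v <= 0.0031308:
        return v * 12.92
    return 1.055 * v ** (1 / 2.4) - 0.055


def max_gamma22_error(steps: int = 1024) -> float:
    """Largest absolute difference between the piecewise curve and v ** 2.2."""
    return max(
        abs(srgb_to_linear(i / steps) - (i / steps) ** 2.2)
        for i in range(steps + 1)
    )


if __name__ == "__main__":
    for v in (0.0, 0.02, 0.2, 0.5, 1.0):
        print(v, srgb_to_linear(v), linear_to_srgb(srgb_to_linear(v)))
    print("max deviation from gamma 2.2:", max_gamma22_error())
```

The deviation is a fraction of a percent of full scale, which matches the "subtle but real" description: small in absolute terms, but visible in gradients.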
- How to encode to DCI compliant JPEG2000 (Profiles 3 and 4): OpenJPEG’s image_to_j2k implements JPEG 2000 Profiles 3 and 4 (for 2K and 4K material, respectively) and creates JPEG2000 codestream files.
image_to_j2k -cinema2K 24 -ImgDir XYZ -OutFor j2c
encodes all tiffs in XYZ to DCI compliant JPEG2000 codestream files (suffix “j2c”). This is a lossy process, as Profiles 3 and 4 specify maximum file sizes and bit rates of streams. Encoding time for a 2K tiff on a recent Intel core is around 2-4 seconds. A 100-minute feature at 24 fps has 144,000 frames, so that’s around 5 days on a single core. So there :)
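The estimate is simple arithmetic and easy to redo for other runtimes or machines. Below is a small Python sketch; `encode_days` is a hypothetical helper, 24 fps is assumed, and the `workers` parameter assumes idealized linear scaling across cores or nodes, which real setups will not quite reach.

```python
SECONDS_PER_DAY = 86400


def encode_days(runtime_minutes: float, seconds_per_frame: float,
                fps: int = 24, workers: int = 1) -> float:
    """Estimate wall-clock encoding time in days.

    Assumes a constant per-frame cost and perfect scaling over `workers`.
    """
    frames = runtime_minutes * 60 * fps
    return frames * seconds_per_frame / workers / SECONDS_PER_DAY


if __name__ == "__main__":
    for spf in (2, 3, 4):
        print(f"{spf} s/frame: {encode_days(100, spf):.2f} days on one core")
    print(f"3 s/frame on 8 cores: {encode_days(100, 3, workers=8):.3f} days")
```

At 2 to 4 seconds per frame a 100-minute feature takes roughly 3.3 to 6.7 days on one core, which is where the "around 5 days" figure comes from.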