Switch branches/tags
Nothing to show
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
353 lines (245 sloc) 15.3 KB
This project implements a complete(!) JPEG (Rec. ITU-T T.81 | ISO/IEC
10918-1) codec, plus a library that can be used to encode and decode
JPEG streams. It also implements ISO/IEC 18477 aka JPEG XT which is
an extension towards intermediate, high-dynamic-range lossy and
lossless coding of JPEG. In specific, it supports ISO/IEC
18477-3/-6/-7/-8/-9 encoding.
Unlike many other implementations, libjpeg also implements:
- 12 bpp image coding for the lossy DCT process,
- the predictive lossless mode of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- the hierarchical process of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- the arithmetic coding option of Rec. ITU-T T.81 | ISO/IEC 10918-1,
- coding of up to 256 component images
- upsampling of images for all factors from 1x1 to 4x4
Standard features are of course also supported, such as
sequential and progressive mode in 8bpp.
In addition, this codec provides methods to encode images
- with a bit depth between 8 and 16 bits per sample, fully backwards
compatible to Rec. ITU-T T.81 | ISO/IEC 10918 baseline coding.
- consisting of floating point samples, specifically images with
high dynamic range.
- to encode images without loss, regardless of their bit-depth and their
sample data type.
Example usage:
Standard JPEG compression, with 444 (aka "no") subsampling:
$ jpeg -q <quality> infile.ppm outfile.jpg
Standard JPEG compression, with 422 subsampling:
$ jpeg -q <quality> -s 1x1,2x2,2x2 infile.ppm outfile.jpg
Intermediate dynamic range compression, i.e. compression of images
of a bit-depth between 8 and 16 bits:
$ jpeg -r -q <base-quality> -Q <extension-quality> -h -r12 infile.ppm outfile.jpg
This type of encoding uses a technology known as "residual scans" which
increase the bit-depths in the spatial domain which is enabled by the -r
command line switch. The -Q parameter sets the quality of the residual image.
To improve the precision in the frequency domain, "refinement scans" can be used.
The following encodes a 12-bit image with four additional refinement scans,
enabled by the "-R 4" parameter.
$ jpeg -q <quality> -R 4 -h infile.ppm outfile.jpg
Both technologies can be combined, and the precision of the residual scan
can also be enlarged by using residual refinement scans with the -rR option.
The following command line with use a 12-bit residual scan with four refinement
$ jpeg -r -q <base-quality> -Q <extension-quality> -h -rR 4 infile.ppm outfile.jpg
High-dynamic range compression allows three different profiles of varying
complexity and performance. The profiles are denoted by "-profile <X>" where
<X> is a,b or c. The following encodes an HDR image in profile C:
$ jpeg -r -q <base-quality> -Q <extension-quality> -h -profile c -rR 4 infile.pfm outfile.jpg
HDR images here have to be fed into the command line in "pfm" format.
exr or hdr is not supported as input format and requires conversion to
pfm first. pfm is the floating-point equivalent of ppm and encodes each
pixel by three 32-bit floating point numbers.
Encoding in profiles a and b works likewise, though it is generally advisable to
use "open loop" rather than "closed loop" coding for these two profiles by
additionally providing the "-ol" parameter. This also works for profile C:
$ jpeg -ol -r -profile a -q <base-quality> -Q <extension-quality> -h infile.pfm out.jpg
similar for profile B.
What is common to profiles A and C is that you may optionally also specify
the LDR image, i.e. the image that a legacy JPEG decoder will show. By default,
a simple tone mapping algorithm ("global Reinhard") will be used to derive a
suitable LDR image from the input image:
$ jpeg -ldr infile.ppm -q <base-quality> -Q <extension-quality> -h -rR 4 infile.pfm out.jpg
The profile is by default profile c, but it also works for profile a:
$ jpeg -ol profile a -ldr infile.ppm -q <base-quality> -Q <extension-quality> infile.pfm out.jpg
It is in general advisable for profile c encoding to enable residual refinement scans,
profiles a or b do not require them.
The following options exist for lossless coding integer:
predictive Rec. ITU-T T.81 | ISO/IEC 10918-1 coding. Note, however,
that not many implementations are capable of decoding such stream,
thus this is probably not a good option for all-day purposes.
$ jpeg -p -c infile.ppm out.jpg
While the result is a valid Rec. ITU-T T.81 | ISO/IEC 10918-1 stream,
most other implementations will hick up and break, thus it is not
advisable to use it.
A second option for lossless coding is residual coding within profile c:
$ jpeg -q <quality> -Q 100 -h -r infile.ppm out.jpg
This also works for floating point coding. Note that lossless coding is enabled
by setting the extension quality to 100.
$ jpeg -q <quality> -Q 100 -h -r infile.pfm out.jpg
However, this is only lossless for 16 bit input samples, i.e. there is a precision
loss due to down-converting the 32-bit input to 16 bit. If samples are out of the
601 gamut, the problem also exists that clamping will happen. To avoid that,
encode in the XYZ color space (profile C only, currently):
$ jpeg -xyz -q <quality> -Q 100 -h -r infile.pfm out.jpg
A second option for lossless integer coding is to use a lossless 1-1 DCT
process. This is enabled with the -l command line option:
$ jpeg -l -q 100 -c infile.ppm out.jpg
Refinement scans can be used to increase the sample precision to up to 12
bits. The "-c" command line option disables the lossy color transformation.
Additionally, this implementation also supports JPEG LS, which is
outside of Rec. ITU-T T.81 | ISO/IEC 10918-1 and ISO/IEC 18477. For
that, use the command line option -ls:
$ jpeg -ls -c infile.ppm out.jpg
The "-c" command line switch is necessary to disable the color transformation
as JPEG LS typically encodes in RGB and not YCbCr space.
Optionally, you may specify the JPEG LS "near" parameter (maximum error) with
the -m command line switch:
$ jpeg -ls -m 2 -c infile.ppm out.jpg
JPEG LS also specifies a lossless color transformation that is enabled with
$ jpeg -ls -cls infile.ppm out.jpg
To encode images with an alpha channel, specify the source image that
contains the alpha channel with -al. The alpha channel is a one-component
grey-scale image, either integer or floating point. The quality of the
alpha channel is specified with -aq, that of the regular image with -q:
$ jpeg -al alpha.pgm -aq 80 -q 85 input.ppm output.jpg
Alpha channels can be larger than 8bpp or can be floating point. In both
cases, residual coding is required. To enable residual coding in the alpha
channel, use the -ar command line option. Similar to the regular image,
where residual coding requires two parameters, -q for the base quality and
-Q for the extension quality, an alpha channel that uses residual coding
also requires a base and extension quality, the former is given by -aq,
the latter with -aQ:
$ jpeg -ar -al alphahigh.pgm -q 85 -Q 90 -aq 80 -aQ 90 input.ppm out.jpg
The alpha channel can be encoded without loss if desired. For that, enable
residual coding with -ar and specify an extension quality of 100:
$ jpeg -ar -al alphahigh.pgm -q 85 -Q 90 -aq 80 -aQ 100 input.ppm out.jpg
The alpha channel can use the same technology extensions as the image,
namely refinement scans in the base or extension image, or 12-bit residual
images. The number of refinement scans is selected with -aR and -arR for
the base and residual image, a 12-bit residual image is selected with -ar12.
Decoding is much simpler:
$ jpeg infile.jpg out.ppm
or, for floating point images:
$ jpeg infile.jpg out.pfm
If you want to decode a JPEG LS image, then you may want to tell the
decoder explicitly to disable the color transformation even though the
corresponding marker signalling coding in RGB space is typically missing
for JPEG LS:
$ jpeg -c infile.jpg out.ppm
If an alpha channel is included in the image, the decoder does not
reconstruct this automatically, nor does it attempt to merge the alpha
image into the file. Instead, it may optionally be instructed to write the
alpha channel into a separate 1-component (grey-scale) file:
$ jpeg -al alpha.pgm infile.jpg outfile.ppm
The -al option for the decoder provides the target file for the alpha
Starting with release 1.30, libjpeg will include a couple of optimization
parameters to improve the performance of JPEG and JPEG XT. In this
release, the following additional command line switches are available:
-qt <n> : Selects a different quantization table. The default table,
also enabled by -qt 0, is the one in the legacy JPEG standard
(Rec. ITU-T T.81 | ISO/IEC 10918-1). -qt 1 is the "flat" table for
PSNR-optimal performance. It is not recommended for real-life usage as
its visual performance is non-ideal, it just generates "nice
numbers". -qt 2 is MS-SSIM ideal, but similarly, not necessarily a
good recommendation for all-day use. -qt 3 is a good compromize and
usually works better than -qt 0.
-dz : This option enables a deadzone quantizer that shifts the buckets
by 1/8th of their size to the outside. This is (almost) the ideal choice
for Laplacian sources which would require a shift of 1/12th. Nevertheless,
this option improves the rate-distortion performance by about 0.3dB on
average and works pretty consistent over many images.
Additional options are planned for future releases.
Release 1.40:
In this release, we included additional support for "full profile" encoding, i.e.
encoding parameters that do not fit any of the four profiles specified in 18477-7.
Using such encoding parameters will generate a warning on the command line, but
encoding will proceed anyhow, generating a bitstream that conforms to 18477-7, but
not to any of the profiles in this standard.
With "-profile a -g 0" or "-profile b -g 0" the encoder will generate a file that
uses an inverse TMO lookup similar to profile C with other encoding parameters
identical to those defined by profiles A and B.
The command line option "-lr" will use a logarithmic encoding instead of the gamma
encoding for profile B. Again, this will leave the profile, but will be within the
bounds of 18477-7.
Other than that, a couple of bug fixes have been made. Profile A and B setup could
not reset the toe value for the inverse gamma map, due to a typo of one of the
parameters. Profile B accepted a different gamma value than the default, but never
communicated it to the core code, i.e. it was simply ignored. Profile B setup ignored
the epsilon values for numerator and denomiator, and they were communicated wrongly
into the core code. This was corrected, and epsilons can now be specified on the
command line.
Release 1.50:
This release fixes encoding of ISO/IEC 18477-8 if the IDCT was selected as
transformation in the extension layer and refinement scans were added, i.e.
the command line options -rl -rR 4 created invalid codestreams. Previous
releases used the wrong type of refinement scan (dct bypass refinement instead
of regular refinement) and hence broke reconstruction. Furthermore, previous
releases no longer allowed near lossless coding with DCT bypass. Instead, regular
DCT coding conforming to ISO/IEC 18477-7 was used. To enable the near-lossless
DCT bypass mode, use the new option "-ro" now.
Profile B encoding could potentially create codestreams that run into
clipping of the extension channel; this always happens if the denominator is
larger than 1, and has to happen according to Annex C of ISO/IEC 18477-3.
This release avoids this issue by adjusting the exposure value such that
the denominator always remains smaller than 1.
Release 1.51:
If the JPEG-XT markers were delayed to the frame-header intead the global
header, the previous code did not built up the necessary infrastructure
to compute the checksum and hence could not verify the checksum in such
a condition. The 1.51 release fixes this problem.
Release 1.52:
This file is an updated/enhanced version of the 1.51 release of
the JPEG XT demo software found on It
includes additional features presented in the paper
"JPEG on Steroids : Common Optimization Techniques for JPEG Image Compression"
by the same author.
In specific, the following command line flags are *NEW* to this version and
are available only as a contribution to ICIP 2016:
-oz: This enables the dynamic programming algorithm to enhance
the rate-distortion performance by soft-threshold quantization. It has been
used for the tests in section 3.3 of the paper.
-dr: This enables the smart de-ringing algorithm that has been used
in section 3.6.
Additionally, the following switches have been used for other subsections
of the paper; they are not new to this distribution but available as
part of the regular libjpeg distribution at github or
-s 1x1,2x2,2x2: Enable 420 subsampling (444 is default)
-s 1x1,2x1,2x1: Enable 422 subsampling (444 is default)
-qt n (n=0..8) Use quantization matrix n.
In the paper, n=1 (flat) was used for PSNR-optimized
coding, unless otherwise noted.
-dz The deadzone quantizer in section 3.3
(simpler than -oz)
-v Enable coding in processive mode (section 3.5)
-v -qv Optimized progressive mode (section 3.5)
-h Optimized Huffman coding (always used, unless noted
otherwise, see section 3.4)
Release 1.53:
This release includes additional functionality to inject markers, or
retrieve markers from a codestream while reading. For that, set
the JPGTAG_ENCODER_STOP tag of the JPEG::Write() call to a bitmask
where the encoder should interrupt writing data (this flag already
existed before) then write custom data with JPEG::WriteMarker(), then
continue with JPEG::Write(). On decoding, set JPGTAG_DECODER_STOP to
a bitmask where to stop for markers, then identify markers with
JPEG::PeekMarker(), and retrieve them with JPEG::ReadMarker(). Details
can be found in cmd/encodec.cpp for encoding, and cmd/reconstruct.cpp.
Otherwise, no functional changes.
For license conditions, please check the file README.license in this
Finally, I want to thank Accusoft and the Computing Center of the University of
Stuttgart for sponsoring this project.
Thomas Richter, February 2017