justinruggles edited this page Sep 13, 2010 · 21 revisions

FFmpeg MPEG-4 ALS Audio Encoder

The master branch is a mirror of git://git.ffmpeg.org/ffmpeg.
The alsenc branch contains the work-in-progress ALS encoder.
The alsenc-jbr branch contains the simple ALS encoder written by Justin in 2008.

To Check Out and Build:

git clone git://github.com/justinruggles/FFmpeg-alsenc.git ffmpeg
cd ffmpeg
git clone git://git.ffmpeg.org/libswscale  # libswscale is needed to build FFmpeg
git checkout -b alsenc origin/alsenc
mkdir build
cd build
../configure
make

About MPEG-4 ALS in FFmpeg

As of November 2009, FFmpeg supports decoding of MPEG-4 ALS audio. The decoder was added in revision 20517. It was written by Thilo Borgmann as part of the Google Summer of Code 2009. There are still some unsupported features, but the most important ones are supported.

The goal of the ALS encoder hosted in this repository is to be a compatible encoder, written from scratch, that can eventually be included in FFmpeg. It should be faster than the reference encoder provided by ISO and, hopefully, compress better. The basic design aims to be as flexible as possible, allowing speed vs. compression trade-offs and making it easy to integrate new encoding algorithms.

MPEG-4 ALS has not gained much use as a lossless format, but I think that has much to do with the lack of widespread application support. Since FFmpeg is used by many applications, especially in the open source world, adding both decoding and encoding support to FFmpeg may help to make its use more widespread. In the vast field of lossless codecs, the technical advantages of one format over another are slim, but the fact that ALS is an international standard is a plus. I believe that the open source tools that come with being part of FFmpeg, along with some of the format’s technical advantages (including MP4 encapsulation), will help to spread the use of ALS.

About MPEG-4 ALS

MPEG-4 ALS is an international standard that defines a lossless audio format. The most recently published document is ISO/IEC 14496-3:2005/Amd.2:2006. There is also a more recent unpublished final draft of the next (4th) edition of 14496-3 that contains some changes to ALS. The NUe Group ALS homepage contains a lot of information regarding the history of ALS, as well as related publications.

The basic process of ALS encoding is block partitioning, prediction, channel differencing, and entropy coding. Each step in the process has some choices as to the algorithm used.

Block Partitioning

The audio stream is divided into frames with an equal number of samples (except for the last frame). Each frame can then optionally be sub-divided into blocks to improve compression.
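As a rough illustration of the partitioning step, the sketch below splits a sample list into fixed-length frames and then cuts one frame into blocks. The function names and the uniform split are assumptions made for this example; a real encoder would choose block sizes adaptively rather than always splitting evenly.

```python
def split_frames(samples, frame_length):
    """Split samples into frames of frame_length; the last frame may be shorter."""
    return [samples[i:i + frame_length]
            for i in range(0, len(samples), frame_length)]


def split_blocks(frame, n_blocks):
    """Split one frame into n_blocks blocks (hypothetical uniform split).

    Any remainder samples go into the last block so nothing is lost.
    """
    block_len = len(frame) // n_blocks
    blocks = [frame[i * block_len:(i + 1) * block_len]
              for i in range(n_blocks - 1)]
    blocks.append(frame[(n_blocks - 1) * block_len:])
    return blocks


frames = split_frames(list(range(10)), 4)
blocks = split_blocks(frames[0], 2)
```

Here `frames` holds two full frames and one short final frame, and `blocks` holds the two halves of the first frame.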

Prediction

The core prediction algorithm can be forward-adaptive LPC prediction or backward-adaptive RLS-LMS prediction. For LPC prediction, the PARCOR coefficients are quantized and coded in the bitstream using Rice coding. The specification defines a lossless conversion between LPC and PARCOR coefficients. An optional long-term prediction (LTP) can also be applied in addition to the core short-term prediction. LTP is a form of pitch-lag prediction, similar to that used in some speech codecs.
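The forward-adaptive LPC path can be sketched as follows: compute the autocorrelation of a block, run the Levinson-Durbin recursion to obtain the PARCOR (reflection) and LPC coefficients, and code the integer prediction residual. This is a floating-point sketch under simplifying assumptions: it skips the quantization of PARCOR coefficients and the fixed-point arithmetic that the specification defines, the sign convention for the PARCOR coefficients may differ from the standard, and all function names are made up for this example.

```python
import math


def autocorrelation(x, order):
    """Autocorrelation of x for lags 0..order."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Return (lpc, parcor) for the predictor x_hat[n] = sum(lpc[j] * x[n-1-j])."""
    lpc = [0.0] * order
    parcor = []
    err = float(r[0])
    for i in range(order):
        if err <= 0:
            parcor.append(0.0)  # degenerate signal: stop adapting
            continue
        acc = r[i + 1] - sum(lpc[j] * r[i - j] for j in range(i))
        k = acc / err
        parcor.append(k)
        prev = lpc[:]
        lpc[i] = k
        for j in range(i):
            lpc[j] = prev[j] - k * prev[i - 1 - j]
        err *= 1.0 - k * k
    return lpc, parcor


def predict(history, lpc):
    """Integer prediction of the next sample from the previous samples."""
    taps = min(len(lpc), len(history))
    estimate = sum(lpc[j] * history[-1 - j] for j in range(taps))
    return math.floor(estimate + 0.5)


def residual(x, lpc):
    """Prediction error for every sample of x."""
    return [x[n] - predict(x[:n], lpc) for n in range(len(x))]


def reconstruct(res, lpc):
    """Invert residual(): rebuild the samples from the prediction error."""
    x = []
    for e in res:
        x.append(e + predict(x, lpc))
    return x


samples = [0, 3, 7, 12, 14, 13, 9, 4, -2, -7]
lpc, parcor = levinson_durbin(autocorrelation(samples, 2), 2)
restored = reconstruct(residual(samples, lpc), lpc)
```

Because the encoder and decoder run the same rounded prediction on the same history, `restored` equals `samples` exactly, which is what makes the scheme lossless.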

Channel Differencing

There are two kinds of channel differencing. Joint stereo coding groups channels into pairs and encodes one channel of each pair as a difference signal. Multi-channel coding (MCC) uses weighted channel differencing for multiple channels.
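A minimal sketch of the joint stereo idea, assuming the second channel of a pair is replaced by its difference from the first. The function names are hypothetical, and which channel is kept and how the choice is signalled in the bitstream are not covered here.

```python
def joint_stereo_encode(first, second):
    """Keep the first channel and replace the second with (second - first)."""
    diff = [b - a for a, b in zip(first, second)]
    return first, diff


def joint_stereo_decode(first, diff):
    """Recover the second channel by adding the difference back."""
    second = [a + d for a, d in zip(first, diff)]
    return first, second


left = [100, 102, 105, 103]
right = [101, 102, 104, 104]
kept, diff = joint_stereo_encode(left, right)
decoded_left, decoded_right = joint_stereo_decode(kept, diff)
```

When the two channels are strongly correlated, the difference signal has much smaller values than either channel, so it costs fewer bits after entropy coding.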

Entropy Coding

Entropy coding is done using either Rice or BGMC (arithmetic) coding. BGMC compresses better, but is slower for both encoding and decoding.
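To show what Rice coding does to a residual, here is a small sketch that encodes signed integers with parameter `k` into a string of bits and decodes them again. The signed-to-unsigned mapping and the polarity of the unary prefix are assumptions for this example and may not match the ALS bitstream; BGMC is not shown.

```python
def zigzag(v):
    """Map a signed integer to unsigned: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return (v << 1) if v >= 0 else ((-v << 1) - 1)


def unzigzag(u):
    """Inverse of zigzag()."""
    return (u >> 1) if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Encode values as a '0'/'1' string: unary quotient, then a k-bit remainder."""
    bits = []
    for v in values:
        u = zigzag(v)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        bits.append("1" * quotient + "0")
        if k:
            bits.append(format(remainder, "0{}b".format(k)))
    return "".join(bits)


def rice_decode(bits, k, count):
    """Decode count values from a bit string produced by rice_encode()."""
    pos = 0
    out = []
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1  # skip the terminating '0'
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append(unzigzag((quotient << k) | remainder))
    return out


residuals = [0, -1, 3, -4, 2]
encoded = rice_encode(residuals, 1)
decoded = rice_decode(encoded, 1, len(residuals))
```

Small values produce short codes, which is why a good predictor (small residuals) translates directly into better compression.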

Shared Code with the FLAC Encoder

Some of the functions used in the ALS encoder can be shared with FFmpeg’s FLAC encoder.


  • Rice coding: Determining entropy partitioning and optimal Rice parameters.

  • Joint channel coding: Possible sharing of joint coding mode estimation.

  • LPC: Windowing, autocorrelation, and Levinson-Durbin.
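As an illustration of the first item, one simple way to pick a Rice parameter is to count the bits each candidate would produce and keep the cheapest. This is a brute-force sketch, not the method used by either encoder; the function names, the signed-to-unsigned mapping, and the `max_k` limit are assumptions for this example.

```python
def zigzag(v):
    """Map a signed integer to unsigned so small magnitudes stay small."""
    return (v << 1) if v >= 0 else ((-v << 1) - 1)


def rice_bits(values, k):
    """Total bits needed to Rice-code values with parameter k."""
    # Each value costs: unary quotient + 1 stop bit + k remainder bits.
    return sum((zigzag(v) >> k) + 1 + k for v in values)


def best_rice_param(values, max_k=15):
    """Exhaustively search 0..max_k and return the cheapest parameter."""
    return min(range(max_k + 1), key=lambda k: rice_bits(values, k))


block = [100, -100, 90, -95]
k = best_rice_param(block)
cost = rice_bits(block, k)
```

Running the same search separately on each partition of a block gives a basic form of entropy partitioning: the partition layout whose summed cost is lowest wins.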