# MPEG Audio

* Input: a sequence of 16-bit PCM samples.

* Output: a sequence of MPEG Audio frames (frame = header + code-stream) which can be streamed.

```
audio   +-----------+       +--------------+         +----------+
channel | Time      |       | Quantization |         |          | MPEG audio
---+--->| frequency |---+-->| and          |-------> | Framing  |------->
   |    | mapping   |   |   | coding       |         |          |
   |    +-----------+   |   +--------------+         +----------+
   |                    |           ^
   |                    |           |
   |                    |   +--------------+
   |                    +-->| Psycho-      |
   |                        | acoustic     |
   +----------------------->| model        |
                            +--------------+
```

* The MPEG audio bitstream definition is normative. Most guidance about encoding
is informative. Thus, two MPEG-compliant bitstreams that encode the same audio material at
the same rate but on different encoders may sound very different. On the other hand, a given
MPEG bitstream decoded on different decoders will result in essentially the same output.

## [Layer I](https://en.wikipedia.org/wiki/MPEG-1_Audio_Layer_I)

* 4:1 compression (384 kbps).
* CBR (Constant Bit-Rate).
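
The figures above can be checked with a little arithmetic. The sketch below assumes 48 kHz stereo input (an assumption, the notes do not state the sampling rate) and uses two hypothetical helpers, `pcm_bitrate` and `frame_bits`, to relate the PCM rate, the coded rate and the number of bits available for each block of coded samples.

```python
def pcm_bitrate(sample_rate, bits_per_sample, channels):
    """Bit-rate (bits/s) of the uncompressed PCM input."""
    return sample_rate * bits_per_sample * channels


def frame_bits(coded_bitrate, samples_per_frame, sample_rate):
    """Bits available per frame at a constant bit-rate."""
    return coded_bitrate * samples_per_frame / sample_rate


# Assumed input: 48 kHz, 16 bits/sample, stereo.
raw = pcm_bitrate(48_000, 16, 2)          # 1_536_000 bits/s
ratio = raw / 384_000                     # compression ratio at 384 kbps
bits = frame_bits(384_000, 384, 48_000)   # one block of 384 samples per channel
print(raw, ratio, bits)
```

Under these assumptions the ratio is exactly 4:1, and each frame of 384 samples per channel lasts 8 ms and carries 3072 bits.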

### Encoder

1. Split $s[n]$ into blocks of $12\times 32=384$ samples. For each block:
   1. Analyze the block using a 32-band (analysis) filter bank, producing $12$ coeffs/subband (the coeffs are downsampled by $32$).
   2. Scale each block of $12$ coeffs to ensure that the entire range of the selected quantizer will be used. Output the *scalefactor*.
   3. Using the [FFT](https://en.wikipedia.org/wiki/Fast_Fourier_transform), compute the ATH (Absolute Threshold of Hearing) for the block, raised where needed to account for the masking effects.
   4. Let $R^*$ be the bit-rate selected by the user. While the generated bit-rate $R\leq R^*$:
     1. Decrement the quantization step $\Delta_b$ for each subband $b$, proportionally to the ATH in $b$. Compute $R$.
   5. Output $\{\Delta_b\}_{b=1}^{32}$ and the quantization indexes.
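
Steps 2, 4 and 5 can be sketched as follows. This is a minimal illustration, not the procedure of the standard: it assumes a uniform midtread quantizer, a fixed-length cost per index, and a greedy rate control that halves the step of the subband whose quantization noise most exceeds its threshold, stopping just before the bit budget would be exceeded. The names `scale`, `quantize`, `bits_per_index` and `allocate`, and the per-subband `thresholds` (allowed noise powers), are hypothetical.

```python
import math


def scale(coeffs):
    """Step 2: normalize the coefficients of one subband to [-1, 1]."""
    scalefactor = max(abs(c) for c in coeffs) or 1.0  # avoid dividing by zero
    return scalefactor, [c / scalefactor for c in coeffs]


def quantize(normalized, delta):
    """Uniform midtread quantizer: index = round(x / delta)."""
    return [round(x / delta) for x in normalized]


def bits_per_index(delta):
    """Fixed-length cost of one index when the input lies in [-1, 1]."""
    levels = 2 * round(1 / delta) + 1
    return math.ceil(math.log2(levels))


def allocate(thresholds, samples_per_band, target_bits):
    """Step 4 (greedy variant): refine the steps while the budget allows it."""
    deltas = [1.0] * len(thresholds)

    def total_bits(steps):
        return sum(samples_per_band * bits_per_index(d) for d in steps)

    while True:
        # Noise power of a uniform quantizer is about delta^2 / 12.
        worst = max(range(len(deltas)),
                    key=lambda b: (deltas[b] ** 2 / 12) / thresholds[b])
        trial = deltas.copy()
        trial[worst] /= 2
        if total_bits(trial) > target_bits:
            return deltas
        deltas = trial


scalefactor, normalized = scale([0.5, -2.0, 1.0])
indexes = quantize(normalized, 0.25)
deltas = allocate([0.001, 0.1], samples_per_band=12, target_bits=84)
print(scalefactor, indexes, deltas)
```

In this toy run the subband with the lower threshold receives all the refinement, which is the intended effect: bits go where the quantization noise would be most audible.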

### Decoder

1. For each input frame:
   1. "Dequantize" the coeffs of each subband.
   2. Re-scale the coeffs to their original dynamic range.
   3. Apply the 32-band synthesis filter bank.
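
The first two decoder steps are simple enough to sketch directly. The example assumes a uniform quantizer with step `delta` and one scalefactor per subband; the function names are hypothetical and the synthesis filter bank (step 3) is left out.

```python
def dequantize(indexes, delta):
    """Step 1: map quantization indexes back to normalized values."""
    return [i * delta for i in indexes]


def rescale(normalized, scalefactor):
    """Step 2: restore the original dynamic range of the subband."""
    return [x * scalefactor for x in normalized]


def decode_subband(indexes, delta, scalefactor):
    """Steps 1 and 2 applied to the coefficients of one subband."""
    return rescale(dequantize(indexes, delta), scalefactor)


coeffs = decode_subband([1, -4, 2], delta=0.25, scalefactor=2.0)
print(coeffs)
```

With such a quantizer, each reconstructed coefficient differs from the original by at most `delta * scalefactor / 2`.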

## [Layer II](https://en.wikipedia.org/wiki/MPEG-1_Audio_Layer_II)

* 8:1 compression (174 kbps).
* CBR (Constant Bit-Rate).
* Increases block-size to $3\times 12\times 32=1152$ samples.

## [Layer III](https://en.wikipedia.org/wiki/MP3)

* CBR and ABR (Average Bit-Rate).
* Typically, 128 kbps.
* Increases block-size to $3\times 12\times 32=1152$ samples.

* Developed by the MPEG Audio group (Fraunhofer IIS, University of Hannover, AT&T-Bell Labs, Thomson-Brandt and CCETT (Centre Commun d'Études de Télévision et Télécommunications)) in 1992.
* Used in MP3 players, DVDs, etc.
* Encoding algorithm:
  1. Split the original PCM sequence into 32 PCM sequences, one for each frequency subband (subband coding).
  2. Transform each subband using the MDCT.
  3. Determine the quantization values for each subband using a psycho-acoustic model.
  4. Losslessly compress each MDCT subband sequence using Huffman coding.
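
The MDCT step can be illustrated with a direct $O(N^2)$ implementation. This is only a sketch under simplifying assumptions: it uses a rectangular window (real encoders apply a tapered window before the transform), the signal length is a multiple of $N$, and the function names are hypothetical. Each block of $2N$ samples yields $N$ coefficients, and the time-domain aliasing introduced by one block is cancelled by its 50%-overlapped neighbours.

```python
import math


def mdct(block):
    """MDCT of 2N samples, producing N coefficients."""
    n_coeffs = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n_coeffs
                         * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for n, x in enumerate(block))
        for k in range(n_coeffs)
    ]


def imdct(coeffs):
    """Inverse MDCT of N coefficients, producing 2N aliased samples."""
    n_coeffs = len(coeffs)
    return [
        sum(c * math.cos(math.pi / n_coeffs
                         * (n + 0.5 + n_coeffs / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)) / n_coeffs
        for n in range(2 * n_coeffs)
    ]


def roundtrip(signal, n_coeffs):
    """MDCT then IMDCT with 50% overlap-add; len(signal) % n_coeffs == 0."""
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * n_coeffs
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_coeffs + 1, n_coeffs):
        aliased = imdct(mdct(padded[start:start + 2 * n_coeffs]))
        for n, value in enumerate(aliased):
            out[start + n] += value  # overlap-add cancels the aliasing
    return out[n_coeffs:n_coeffs + len(signal)]


signal = [math.sin(0.3 * n) for n in range(54)]
rebuilt = roundtrip(signal, n_coeffs=18)
error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print(error)
```

Without quantization the round trip reproduces the input up to floating-point error; in the codec, the loss comes from quantizing the coefficients between the two transforms.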
