asp_project_report.tex

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% University/School Laboratory Report
% LaTeX Template
% Version 3.1 (25/3/14)
%
% This template has been downloaded from:
% http://www.LaTeXTemplates.com
%
% Original author:
% Linux and Unix Users Group at Virginia Tech Wiki 
% (https://vtluug.org/wiki/Example_LaTeX_chem_lab_report)
% 
% License:
% CC BY-NC-SA 3.0 (http://creativecommons.org/licenses/by-nc-sa/3.0/)
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%----------------------------------------------------------------------------------------
%	PACKAGES AND DOCUMENT CONFIGURATIONS
%----------------------------------------------------------------------------------------

\documentclass{article}

\usepackage{geometry}
\usepackage{listings}
\usepackage{siunitx} % Provides the \SI{}{} and \si{} command for typesetting SI units
\usepackage{graphicx} % Required for the inclusion of images
\usepackage{amsmath} % Required for some math elements 

\setlength\parindent{0pt} % Removes all indentation from paragraphs

\renewcommand{\labelenumi}{\alph{enumi}.} % Make numbering in the enumerate environment by letter rather than number (e.g. section 6)

%\usepackage{times} % Uncomment to use the Times New Roman font

%----------------------------------------------------------------------------------------
%	DOCUMENT INFORMATION
%----------------------------------------------------------------------------------------

\title{Lossless Compression of Digital Audio Signals \\ ELE4ASP} % Title

\author{Joshua \textsc{Hickey}} % Author name

\date{\today} % Date for the report

\begin{document}

\maketitle % Insert the title, author and date

% If you wish to include an abstract, uncomment the lines below


\begin{abstract}

% Background
Data compression algorithms that do not approximate or truncate data during the reconstruction/deconstruction process are referred to as ``lossless''. Applications for lossless compression of audio data are largely for portability and convention, but are also a practical solution for archiving and production. An example of a popular lossless audio codec is ``FLAC''.\\

% Results
Fixed coefficient linear predictors were implemented to generate an error signal to be encoded, compressed and written to an output file. The encoding was performed by taking the error signal, and after calculating the parameter $k$ based off the sum of squared prediction error, generating ``Rice codes'' for the error signal. Basic data compression was performed by writing only descriptive bits in the encoded symbol to the output file.\\

% Conclusions
It was observed that more accurate prediction yields better compression performance, as there will be a lower sum of squared error in each block of data and therefore shorter ``Rice codes'' can be generated. With poor prediction, the encoding algorithm generates inefficient symbols due to a high amount of appending zeroes (caused by prediction error signals with high value ``upper-nibbles''. \\

Specifically, data compression in the order of ninety percent of the original data size was consistently observed for ideal encoding conditions, and some data expansion was observed in the case that a non ideal $k$ parameter derivation consistently generated codes larger than the original sample. \\ 

\end{abstract}

%----------------------------------------------------------------------------------------
%	SECTIONS
%----------------------------------------------------------------------------------------
\section*{Algorithm}

\subsection*{Prediction}

Several linear prediction models were used in development and testing of the compression algorithm.\\

% Average predictor
\begin{equation}
S_p(i) = \sum_{j=1}^{N=2} 0.5s(i-j) \\
\end{equation}

\begin{equation}
S_p(i) = s(i-1) \\
\end{equation}

\begin{equation}
S_p(i) = 2s(i-1) - s(i-2) \\ \\ 
\end{equation}

Experiments conducted in order to make observations of the compression performance and other encoding factors used the following linear predictor. \\

\begin{equation}
S_p(i) = 3s(i-1) - 3s(i-2) + s(i-3) \\
\end{equation}

The iterative process of the linear prediction library is described in the following block diagram.\\
\begin{figure}[!htb]
\centering
\includegraphics[scale=.3]{Diagrams/prediction.eps}
\caption{Linear prediction iteration}
\label{fig:predict}
\end{figure}

\subsection*{Rice Coding}
The following block diagram describes the algorithm used to encode the prediction error, assume the diagram refers to a single byte sample of the error signal.\\

\begin{figure}[!htb]
\centering
\includegraphics[scale=.3]{Diagrams/rice.eps}
\caption{Rice coding process}
\label{fig:rice}
\end{figure}

The encoding scheme produced a symbol in the following format.\\

\begin{center}
\begin{tabular}{|r|c|c|l|}
  \hline
  \textbf{1} & $X$ (no. of zeroes) & \textbf{1} & last $k$ bits \\
  \hline
\end{tabular}
\end{center}

\subsection*{Bit and File Operations}
A simple bit operations library was written for the purpose of pushing (bit-by-bit) the encoded symbol to a byte buffer which, when ``full'', would be written to the output file as a single character. The relevant bits of the codes were masked out and simply shifted into the next available space in the byte buffer.\\ 

\section*{Implementation}
The C programming language was used to implement every component of this project, proving useful in handling bit level file operations for data compression and simple implementation of signal processing for samples in a block of data using loop structures.\\
 
The following diagram describes the system level implementation of components within the compression algorithm program.\\

\begin{figure}[!htb]
\centering
\includegraphics[scale=.3]{Diagrams/system.eps}
\caption{System diagram of compression program}
\label{fig:digraph}
\end{figure}

A file read was read in by the program, blocks of data from the file were accessed by the prediction library and another block of data generated based on the eror calculated. The error signal was then encoded using the rice coding library and written to the output file along with data parameters required for decoding ($k$, block size, predictor used). The program checks for any unread data in the file after performing block processing and iterates through the remaining data as usual (just with adjusted memory allocation as a result of a different block size).\\
 
\section*{Results}

The following experiment was conducted using a 2562 byte .wav file, using a range of block sizes in an attempt to observe correlating features between compression efficiency and parameters calculated using block size or attributes of the data contained within the block.


\begin{table}[h]  
  \centering
  \begin{tabular}{|c|c|c|c|}
    \hline
    \textbf{Input file size} & \textbf{Output file size} & \textbf{Block size} & \textbf{Average K} \\
    \hline
    \hline
    2562 & 2282 & 1700 & 5\\
    \hline
    2562 & 2425 & 1500 & 6\\
    \hline
    2562 & 2779 & 1400 & 7\\
    \hline
    2562 & 2395 & 900 & 5\\
    \hline
    2562 & 2858 & 200 & 6\\
    \hline
  \end{tabular}
  \caption{Table of results comparing file sizes in bytes}
  \label{tab:results}
\end{table}

Compression performance varied greatly for block size, largely because the parameter $k$ is calculated from the sum of squared error of the error signal in the block of data, and as the the prediction was not optimal, an erratic error signal was observed and therefore larger $k$ values were used in the encoding scheme, generating longer codes. \\

\section*{Conclusion}
It was demonstrated that given an accurate prediction model of the input signal, ``Rice codes'' are a viable and efficient solution to losslessly compress data by enabling the description of a signal sample in shorter bit widths than the literal value. \\ 

Or more conclusively, the prediction model is pivotal in increasing the compression performance of the encoder by minimizing prediction errors and therefore allowing the use of smaller symbols to describe the intact input signal. \\

\newpage
\appendix
\newgeometry{left=2cm}
\section*{Source Code} \label{App:code}
% the \\ insures the section title is centered below the phrase: AppendixA
\subsection*{Main Routine}
\lstinputlisting{asp_project.c}
\newpage
\subsection*{Prediction}
\lstinputlisting{prediction.c}
\lstinputlisting{prediction.h}
\newpage
\subsection*{Rice Coding}
\lstinputlisting{encoder.c}
\lstinputlisting{encoder.h}
\newpage
\subsection*{File IO}
\lstinputlisting{file_io.c}
\lstinputlisting{file_io.h}

%----------------------------------------------------------------------------------------


\end{document}