# 7.1 Sampling of Signals and Signal Reconstruction from Samples

## 7.1.1 Sampling Theorem

Let the signal $x(t)$ be bandlimited with bandwidth $W$, i.e., $X(f) = 0$ for $|f| \ge W$. Let $x(t)$ be sampled at multiples of a basic sampling interval $T_s$, where $T_s \le {1\over 2W}$, to yield the sequence $\{x(nT_s)\}^{+\infty}_{n=-\infty}$. Then it is possible to reconstruct the original signal $x(t)$ from the sampled values by the reconstruction formula

\begin{equation}
x(t) = \sum_{n = -\infty}^{\infty} 2W' T_s x(nT_s) sinc(2W'(t-nT_s))
\end{equation}

- where $W'$ is any number that satisfies the condition $W \le W' \le {1\over T_s} - W$; in the special case where $T_s = {1\over 2W}$, we have $W' = W = {1\over 2T_s}$ and

\begin{equation}
x(t) = \sum_{n = -\infty}^{\infty} x(nT_s) sinc({t \over T_s} - n)
\end{equation}
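
The reconstruction formula can be checked numerically. The sketch below is only an illustration: the test signal $sinc^2(2t)$ (bandlimited to 2 Hz), the sampling interval $T_s = 1/8$ s, and the truncation of the infinite sum to 801 terms are all assumptions made for the example. It uses $W' = {1\over 2T_s}$, which satisfies $W \le W' \le {1\over T_s} - W$ here, so that $2W'T_s = 1$.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi u) / (pi u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, n_start, Ts, t):
    """Truncated sum x(t) ~ sum_n x(n Ts) sinc(t/Ts - n), n = n_start, n_start+1, ..."""
    return sum(x_n * sinc(t / Ts - (n_start + k)) for k, x_n in enumerate(samples))

# Illustrative test signal (an assumption): sinc^2(2t) is bandlimited to W = 2 Hz.
def x(t):
    return sinc(2.0 * t) ** 2

Ts = 1.0 / 8.0                  # fs = 8 Hz, above the Nyquist rate 2W = 4 Hz
n_start, n_stop = -400, 400     # keep only 801 terms of the infinite sum
samples = [x(n * Ts) for n in range(n_start, n_stop + 1)]

t0 = 0.3                        # a point between two sampling instants
x_rec = reconstruct(samples, n_start, Ts, t0)
print(x(t0), x_rec)             # the two values should agree closely
```

Because the sum is truncated, the match is approximate rather than exact; the residual shrinks as more samples are kept.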

**Proof** Let $x_{\delta}(t) = \sum_{n=-\infty}^{\infty}x(nT_s) \delta(t - nT_s) = x(t) \sum_{n=-\infty}^{\infty} \delta(t - nT_s)$

- take Fourier transform
\begin{equation}
X_{\delta}(f) = X(f) \otimes \mathcal{F}(\sum_{n=-\infty}^{\infty} \delta(t - nT_s)) \\
\mathcal{F}(\sum_{n=-\infty}^{\infty} \delta(t - nT_s)) = {1\over T_s} \sum_{n = -\infty}^{\infty}\delta(f - {n \over T_s}) \\
X_{\delta}(f) = X(f) \otimes {1\over T_s} \sum_{n = -\infty}^{\infty}\delta(f - {n \over T_s}) = {1\over T_s} \sum_{n=-\infty}^{\infty} X(f - {n\over T_s})
\end{equation}

- if $T_s > {1\over 2W}$, then the replicated spectrum of $x(t)$ overlaps and reconstruction of the original signal is not possible
    - this type of distortion results from undersampling, known as **aliasing error** or **aliasing distortion**
- if $T_s \le {1\over 2W}$, no overlap occurs, and by employing an appropriate filter we can reconstruct the original signal
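
A small numerical illustration of aliasing, with frequencies chosen only for the example: a 7 Hz cosine sampled at $f_s = 10$ Hz (so $T_s > {1\over 2W}$) produces exactly the same samples as a 3 Hz cosine, so the two cannot be told apart after sampling.

```python
import math

fs = 10.0               # sampling rate in Hz (Ts = 0.1 s)
f_high = 7.0            # above fs / 2 = 5 Hz, so this tone is undersampled
f_alias = fs - f_high   # 3 Hz: where the replicated spectrum lands

high = [math.cos(2 * math.pi * f_high * n / fs) for n in range(20)]
low = [math.cos(2 * math.pi * f_alias * n / fs) for n in range(20)]

# Largest difference between the two sample sequences (rounding error only).
max_diff = max(abs(a - b) for a, b in zip(high, low))
print(max_diff)
```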

<img src="img/Snip20191114_5.png" width=80%/>

- to acquire the original signal, filter the sampled signal through a lowpass filter with frequency response characteristics
    1. $H(f) = T_s$ for $|f| < W$
    2. $H(f) = 0$ for $|f| \ge {1\over T_s} - W$
- ideal LPF: $H(f) = T_s \Pi({f\over 2W'})$, where $W \le W' < {1\over T_s} - W$

\begin{equation}
X(f) = X_{\delta}(f) T_s \Pi({f \over 2W'})
\end{equation}


\begin{equation}
x(t) = x_{\delta}(t) \otimes 2W' T_s sinc(2W' t) \\
= (\sum_{n=-\infty}^{\infty} x(nT_s) \delta(t - nT_s)) \otimes 2W' T_s sinc(2W' t) \\
= \sum_{n=-\infty}^{\infty} 2W' T_s x(nT_s) sinc(2W'(t - nT_s))
\end{equation}

- the sampling rate $f_s = {1\over T_s} = 2W$ is the minimum sampling rate at which no aliasing occurs, known as the **Nyquist sampling rate**
- if sampling is done at the Nyquist rate, then the only choice for reconstruction filter is an ideal LPF and $W' = W = {1\over 2T_s}$

\begin{equation}
x(t) = \sum_{n=-\infty}^{\infty} x({n \over 2W}) sinc(2Wt - n) = \sum_{n=-\infty}^{\infty}x(nT_s) sinc({t \over T_s} - n)
\end{equation}


**End Proof**

- in practical systems, sampling is done at a rate higher than the Nyquist rate, which allows for the reconstruction filter to be realizable and easier to build
- the distance between two adjacent replicated spectra in the frequency domain, $W_G = ({1\over T_s} - W) - W = f_s - 2W$, is known as the **guard band**



# 7.2 Quantization

- after sampling, we have a discrete-time signal (signal with values at integer multiples of $T_s$)
- the amplitudes of these signals are still continuous
- after sampling, we use quantization, in which the amplitude becomes discrete as well
- after the quantization step, we will deal with a discrete-time, discrete-amplitude signal, in which each sample is represented by a finite number of bits

## 7.2.1 Scalar Quantization

<img src="img/Snip20191116_29.png" width=60%/>

- each sample is quantized into one of a finite number of levels, which is then encoded into a binary representation
- the quantization process is a rounding process; each sampled signal point is rounded to the nearest value from a finite set of possible quantization levels


The set of real numbers $\mathbb{R}$ is partitioned into $N$ disjoint subsets denoted by $\mathcal{R}_k, 1 \le k \le N$ (each called a **quantization region**);

corresponding to each subset $\mathcal{R}_k$, a *representation point (or quantization level)* $\hat{x}_k$ is chosen, which usually belongs to $\mathcal{R}_k$;

if the sampled signal at time $i$, $x_i$, belongs to $\mathcal{R}_k$, then it is represented by $\hat{x}_k$, which is the **quantized version** of $x_i$;

then $\hat{x}_k$ is represented by a binary sequence and transmitted, which is called *encoding*

for $N$ possible quantized levels, $\log_2N$ bits are enough to encode these levels in a binary sequence => the number of bits required to transmit each source output is $R= \log_2N$ bits

The representation point (quantized value) in each region is denoted by $\hat{x}_i$ and the quantization function $Q$ is defined by

\begin{equation}
Q(x) = \hat{x}_i \textrm{ for all } x \in \mathcal{R}_i
\end{equation}

we can define the **average distortion** resulting from quantization; the **squared error distortion** is defined as $(x-\hat{x})^2 = (x-Q(x))^2$;
\begin{equation}
d(x, \hat{x}) = (x - Q(x))^2 = (x - \hat{x})^2 = \tilde{x}^2
\end{equation}

**Definition** since $X, \hat{X}, \tilde{X}$ are random variables, the average (mean squared error) distortion is given by
\begin{equation}
D = E[d(X, \hat{X})] = E(X - Q(X))^2
\end{equation}

**Definition** if a random variable $X$ is quantized to $Q(X)$, the **signal-to-quantization-noise ratio** (SQNR) is defined by
\begin{equation}
\textrm{SQNR} = {E(X^2) \over E(X - Q(X))^2}
\end{equation}

When dealing with random signal $X(t)$: **Quantization Noise Power** $P_{\tilde{X}} = \lim_{T \rightarrow \infty} {1\over T} \int_{-T/2}^{T/2} E(X(t) - Q(X(t)))^2 dt$

When dealing with signals: **Signal Power** $P_X = \lim_{T \rightarrow \infty} {1\over T} \int_{-T/2}^{T/2} E(X^2(t)) dt$

When dealing with signals: $SQNR = {P_X \over P_{\tilde{X}}}$


**Example** input sampled signal is in the range $[-x_{max}, x_{max}]$ and the number of quantization levels is $N = 2^v$; 
then for a uniform quantizer, the size of each partition region is $\Delta = {2x_{max} \over N}$; the quantized values are chosen to be the midpoints of the quantization regions; therefore, the quantization error $\tilde{X} = X - Q(X)$ is a random variable taking values in the interval $(-\Delta / 2, \Delta / 2]$; for large enough $N$, the error can be approximated by a uniformly distributed random variable over $(-\Delta / 2, \Delta / 2]$, with

pdf given by
\begin{equation}
f_{\tilde{X}}(\tilde{x}) = \begin{cases}
    {1\over \Delta}, -{\Delta \over 2} < \tilde{x} \le {\Delta \over 2} \\
    0, otherwise
\end{cases}
\end{equation}

thus 
\begin{equation}
E[\tilde{X}^2] = {1\over \Delta} \int_{-\Delta \over 2}^{\Delta \over 2} \tilde{x}^2 d\tilde{x} = {\Delta^2 \over 12} = {x_{max}^2 \over 3\times 4^v}
\end{equation}

and, since $SQNR = {P_X \over E[\tilde{X}^2]} = {3 \times 4^v P_X \over x_{max}^2}$, in decibels
\begin{equation}
SQNR|_{dB} \approx 10\log_{10} {P_X \over x_{max}^2} + 6v + 4.8
\end{equation}
- each extra quantization bit increases the SQNR by about 6 dB
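
The 6 dB-per-bit rule can be checked with a short simulation. This is a sketch under assumptions: the input is a dense, evenly spaced grid on $[-1, 1]$ (standing in for a uniformly distributed source, so $P_X \approx 1/3$), and the function names are hypothetical. Region boundaries are assigned to the upper region, a minor convention difference that does not affect the mean squared error.

```python
import math

def uniform_quantize(x, x_max, v):
    """Midpoint uniform quantizer with N = 2**v levels on [-x_max, x_max]."""
    N = 2 ** v
    delta = 2.0 * x_max / N
    k = math.floor((x + x_max) / delta)   # index of the region containing x
    k = min(max(k, 0), N - 1)             # clamp samples at or beyond the edges
    return -x_max + (k + 0.5) * delta     # midpoint of region k

def sqnr_db(samples, x_max, v):
    """Empirical SQNR in dB: signal power over mean squared quantization error."""
    p_x = sum(s * s for s in samples) / len(samples)
    d = sum((s - uniform_quantize(s, x_max, v)) ** 2 for s in samples) / len(samples)
    return 10.0 * math.log10(p_x / d)

# Assumed test input: 2**14 evenly spaced points covering [-1, 1].
M = 1 << 14
samples = [-1.0 + (2 * k + 1) / M for k in range(M)]

for v in (4, 6, 8):
    exact = 10.0 * math.log10(3 * 4 ** v * (1.0 / 3.0))   # 3 * 4^v * P_X / x_max^2
    print(v, round(sqnr_db(samples, 1.0, v), 2), round(exact, 2))
```

For this input the measured SQNR should track $10\log_{10}(3 \times 4^v P_X / x_{max}^2)$ closely, rising by about 12 dB for every two extra bits.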

### Uniform Quantization

- in a uniform quantizer, the entire real line is partitioned into $N$ regions; all regions except $\mathcal{R}_1$ and $\mathcal{R}_N$ are of equal length, which is denoted by $\Delta$

- for all $1 \le i \le N-2$, we have $a_{i+1} - a_i = \Delta$

- it is further assumed that the quantization levels are at a distance of $\Delta \over 2$ from the boundaries $a_1, a_2, ..., a_{N - 1}$

- in a uniform quantizer, the mean squared error distortion is given by

\begin{equation}
D = \int_{-\infty}^{a_1} (x - (a_1 - \Delta /2))^2 f_X(x) dx \\
+ \sum_{i = 1}^{N - 2} \int_{a_1 + (i - 1)\Delta}^{a_1 + i \Delta} (x - (a_1 + i\Delta - \Delta / 2))^2 f_X(x) dx \\
+ \int_{a_1 + (N - 2)\Delta}^{\infty} (x - (a_1 + (N - 2)\Delta + \Delta/ 2))^2 f_X(x) dx
\end{equation}

- so $D$ is a function of two design parameters $a_1$ and $\Delta$; in order to design the optimal uniform quantizer, we have to differentiate $D$ w.r.t. these variables and find the values that minimize $D$



### Nonuniform Quantization

- the partition regions are unequal and one has more degrees of freedom to optimize the quantization
- assume we are interested in designing the optimal mean squared error quantizer with $N$ levels of quantization and no other constraints on the regions; the average distortion is given by:

\begin{equation}
D = \int_{-\infty}^{a_1} (x - \hat{x}_1)^2 f_X(x) dx \\
+ \sum_{i = 1}^{N - 2} \int_{a_i}^{a_{i+1}} (x - \hat{x}_{i+1})^2 f_X(x) dx \\
+ \int_{a_{N - 1}}^{\infty} (x-\hat{x}_N)^2 f_X(x) dx
\end{equation}

- there is a total of $2N - 1$ variables in this expression: $a_1, a_2, ..., a_{N - 1}$ and $\hat{x}_1, \hat{x}_2, ..., \hat{x}_N$; differentiating with respect to $a_i$ and setting the result to zero gives

\begin{equation}
{\partial D \over \partial a_i} = f_X(a_i)((a_i - \hat{x}_i)^2 - (a_i - \hat{x}_{i+1})^2) = 0
\end{equation}

yields

\begin{equation}
a_i = {1\over 2} (\hat{x}_i + \hat{x}_{i+1})
\end{equation}

- in an optimal quantizer, the *boundaries of the quantization regions are the midpoints of the quantized values* 

\begin{equation}
{\partial D \over \partial \hat{x}_i} = -\int_{a_{i - 1}}^{a_i} 2(x-\hat{x}_i) f_X(x) dx = 0
\end{equation}

\begin{equation}
\hat{x}_i = {\int_{a_{i - 1}}^{a_i}  xf_X(x) dx \over \int_{a_{i - 1}}^{a_i}  f_X(x) dx}
\end{equation}

- in an optimal quantizer, the *quantized value (or representation point) for a region should be chosen to be the centroid of that region*

- these are known as Lloyd-Max conditions
    1. the boundaries of the quantization regions are the midpoints of the corresponding quantized values
    2. the quantized values are the centroids of the quantization regions
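
The two conditions suggest an iterative design procedure: fix the levels and recompute the boundaries, then fix the boundaries and recompute the levels. The sketch below is one possible way to do this, under assumptions: it works on a finite set of samples instead of the pdf $f_X(x)$ (so centroids become sample means), starts from evenly spaced levels, and runs a fixed number of iterations. The name `lloyd_max` and its parameters are hypothetical.

```python
import bisect

def lloyd_max(samples, N, iters=50):
    """Alternate the two Lloyd-Max conditions on an empirical sample set."""
    lo, hi = min(samples), max(samples)
    # Assumed starting point: N evenly spaced levels over the sample range.
    levels = [lo + (k + 0.5) * (hi - lo) / N for k in range(N)]
    for _ in range(iters):
        # Condition 1: boundaries are the midpoints of adjacent levels.
        bounds = [(levels[k] + levels[k + 1]) / 2 for k in range(N - 1)]
        # Condition 2: each level is the centroid (sample mean) of its region.
        sums, counts = [0.0] * N, [0] * N
        for x in samples:
            k = bisect.bisect_left(bounds, x)   # a sample on a boundary goes to the lower region
            sums[k] += x
            counts[k] += 1
        # An empty region keeps its previous level.
        levels = [sums[k] / counts[k] if counts[k] else levels[k] for k in range(N)]
    bounds = [(levels[k] + levels[k + 1]) / 2 for k in range(N - 1)]
    return bounds, levels

# Evenly spread samples on (0, 1): the result should be a uniform quantizer.
uniform = [(k + 0.5) / 1000 for k in range(1000)]
bounds, levels = lloyd_max(uniform, 4)
print(bounds)
print(levels)
```

For evenly spread data the procedure settles on equal-width regions; for a peaked distribution the levels would crowd toward the region where samples are dense.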



# 7.4 Waveform Coding

- waveform coding schemes are designed to reproduce the waveform output of the source at the destination with as little distortion as possible


## 7.4.1 Pulse Code Modulation (PCM)

- consists of three basic sections, a sampler, a quantizer and an encoder

<img src="img/Snip20191117_30.png" width=80%/>

- $g(t)$: continuous-time, continuous-amplitude analog signal
- $g_s(t)$: discrete-time, continuous-amplitude discrete signal
- $\{g_n\}$: discrete-time, discrete-amplitude digital signal


assumptions
1. the waveform signal is bandlimited with a maximum frequency of $W$, thus can be fully reconstructed from samples taken at a rate of at least $f_s = 2W$ 
2. the signal is of finite amplitude, i.e., there exists a maximum amplitude $x_{max}$ such that $|x(t)|\le x_{max}$ for all $t$
3. the quantization is done with a large number of quantization levels $N = 2^v$ 

### Uniform PCM

- assume that the quantizer is uniform
- the range of the input samples: $[-x_{max}, +x_{max}]$
- the number of quantization levels is $N$
- length of each quantization region is $\Delta = {2x_{max} \over N} = {x_{max} \over 2^{v - 1}}$
- the quantized values in uniform PCM are chosen to be the midpoints of the quantization regions
    - error $\tilde{x} = x - Q(x)$ is a random variable taking values in the interval $(-{\Delta \over 2}, +{\Delta \over 2}]$
    - under the assumption that $N$ is large, $\Delta$ is small compared with $x_{max}$
    - the error $\tilde{X} = X - Q(X)$ can be approximated by a uniformly distributed random variable on $(-{\Delta \over 2}, +{\Delta \over 2}]$

\begin{equation}
f(\tilde{x}) = \begin{cases}
    {1\over\Delta}, -{\Delta \over 2} < \tilde{x} \le {\Delta \over 2} \\
    0, otherwise
\end{cases}
\end{equation}

- distortion introduced by quantization (quantization noise):
\begin{equation}
E[\tilde{X}^2] = \int_{-{\Delta \over 2}}^{+{\Delta \over 2}} {1\over \Delta} \tilde{x}^2 d\tilde{x} = {\Delta^2 \over 12} = {x_{max}^2 \over 3N^2} = {x_{max}^2 \over 3 \times 4^v}
\end{equation}

- where $v$ is the number of bits per source sample

\begin{equation}
SQNR = {P_X \over \bar{\tilde{X}^2}} = {3N^2P_X \over x_{max}^2} = {3 \times 4^vP_X \over x_{max}^2}
\end{equation}

- where $P_X$ is the power in each sample
\begin{equation}
P_X = R_X(\tau) |_{\tau = 0} = \int_{-\infty}^{\infty} S_X(f) df = \int_{-\infty}^{\infty} x^2 f_X(x)dx
\end{equation}

- $P_X = E[X^2] \le x_{max}^2$

\begin{equation}
SQNR|_{dB} \approx 10 \log_{10} {P_X \over x_{max}^2} + 6v + 4.8
\end{equation}

- each extra bit (increase in $v$) increases the SQNR by 6 dB

#### Bandwidth Considerations

- if a signal has a bandwidth of $W$
- the minimum number of samples for perfect reconstruction of the signal is given by the sampling theorem which is $2W$ samples per sec
- if guardband is required, the number of samples per second is $f_s > 2W$
- for each sample, $v$ bits are used, thus a total of $vf_s$ bits/sec are required for transmission of the PCM signal
- the minimum bandwidth requirement for binary transmission of $R$ bits/sec is $R \over 2$
- thus the minimum bandwidth requirement of a PCM system is $BW_{req} = {vf_s \over 2}$

- the PCM system expands the bandwidth of the original signal by a factor of **at least** $v$

## 7.4.2 Differential Pulse Code Modulation

- in a PCM system, after sampling the signal, each sample is quantized independently using a scalar quantizer
    - previous samples have no effect on the quantization of the new samples
- when a bandlimited random process is sampled at the Nyquist rate or faster, the sampled values are usually correlated random variables


In the simplest form of **DPCM**, the difference between two adjacent samples is quantized; because two adjacent samples are highly correlated, their difference has small variations; therefore to achieve a certain level of performance, fewer levels (and fewer bits) are required to quantize it.

<img src="img/Snip20191117_33.png"/>

- the input to the quantizer is $Y_n = X_n - \hat{Y}_{n-1}'$
- $\hat{Y}_{n-1}'$ is closely related to $X_{n-1}$
- the accumulation of quantization noise is prevented
- the input to the quantizer, $Y_n$, is quantized by a scalar quantizer to produce $\hat{Y}_n$

\begin{equation}
Y_n = X_n - \hat{Y}'_{n-1} \\
\hat{Y}'_n = \hat{Y}_{n} + \hat{Y}'_{n-1}
\end{equation}

the quantization error between the input and the output of the quantizer is
\begin{equation}
\hat{Y}_n - Y_n = \hat{Y}_n - (X_n - \hat{Y}'_{n-1}) \\= \hat{Y}_n - X_n +\hat{Y}'_{n-1} \\= \hat{Y}'_n - X_n
\end{equation}


at the receiving end, we have
\begin{equation}
\hat{X}_n = \hat{Y}_n + \hat{X}_{n-1}
\end{equation}
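
The recursions above can be traced in a few lines of code. This is a minimal sketch under assumptions: the scalar quantizer is an unbounded uniform rounding quantizer with step `step` (so there is no overload), the initial values $\hat{Y}'_{-1}$ and $\hat{X}_{-1}$ are taken to be zero, and the function names are hypothetical.

```python
import math

def quantize(y, step):
    """Assumed scalar quantizer: round to the nearest multiple of step."""
    return step * round(y / step)

def dpcm_encode(x, step):
    """Return the quantized differences Y^_n for the input samples X_n."""
    y_hat = []
    pred = 0.0                       # Y^'_{n-1}, assumed to start at zero
    for x_n in x:
        q = quantize(x_n - pred, step)   # Y^_n = Q(Y_n), with Y_n = X_n - Y^'_{n-1}
        y_hat.append(q)
        pred = q + pred                  # Y^'_n = Y^_n + Y^'_{n-1}
    return y_hat

def dpcm_decode(y_hat):
    """Receiver: X^_n = Y^_n + X^_{n-1}, starting from zero."""
    x_hat, acc = [], 0.0
    for q in y_hat:
        acc = q + acc
        x_hat.append(acc)
    return x_hat

# Assumed test input: a slowly varying sinusoid, so adjacent samples are highly correlated.
x = [math.sin(0.05 * n) for n in range(200)]
step = 0.1
x_hat = dpcm_decode(dpcm_encode(x, step))
max_err = max(abs(a - b) for a, b in zip(x, x_hat))
print(max_err)   # should not exceed step / 2: the error does not accumulate
```

The reconstruction error at every sample equals the quantization error of that sample alone, which is why it stays within half a step however long the sequence is.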

## 7.4.3 Delta Modulation

- simplified version of the DPCM system
- the quantizer is a one-bit quantizer with magnitudes $\pm \Delta$

<img src="img/Snip20191117_34.png"/>

- the quantization noise will be high unless the dynamic range of $Y_n$ is very low => $X_n$ and $X_{n-1}$ must have a **very high correlation** coefficient => to achieve which, the sample rates must be **much higher than Nyquist rate**

- sample rates are high, but since the number of bits per sample is only one, the total number of bits per second required may be lower than that of a PCM system

- advantage: simple structure of the system

\begin{equation}
\hat{X}_n - \hat{X}_{n-1} = \hat{Y}_n \\
\hat{X}_n = \sum_{i=0}^n \hat{Y}_i
\end{equation}

- to obtain $\hat{X}_n$, we only accumulate the values of $\hat{Y}_n$ 

<img src="img/Snip20191117_36.png"/>

- the step size $\Delta$ is a very important parameter
    - a large $\Delta$ allows the modulator to follow rapid changes in the input, but causes excessive quantization noise when the input changes slowly (known as **granular noise**)
    - a small $\Delta$ makes it hard for the modulator to track rapid changes in the input => excessive quantization noise in the tracking period (known as **slope overload distortion**)
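
Both effects can be seen with a toy delta modulator. The sketch below assumes a zero initial accumulator and a one-bit quantizer that outputs $+\Delta$ when the input is at or above the current estimate and $-\Delta$ otherwise; the function name and the test inputs are illustrative only.

```python
def delta_modulate(x, delta):
    """One-bit DM: return the sign sequence and the staircase estimate X^_n."""
    bits, x_hat, acc = [], [], 0.0      # acc holds X^_{n-1}, assumed to start at 0
    for x_n in x:
        eps = 1 if x_n >= acc else -1   # one-bit quantizer output (+1 or -1)
        acc += eps * delta              # X^_n = X^_{n-1} + Y^_n
        bits.append(eps)
        x_hat.append(acc)
    return bits, x_hat

# Granular noise: a constant input makes the output hunt up and down by delta.
flat_bits, flat_hat = delta_modulate([0.0] * 6, 0.1)
print(flat_bits)

# Slope overload: the input rises by 1 per sample but the estimate only by 0.1.
ramp = [float(n) for n in range(10)]
ramp_bits, ramp_hat = delta_modulate(ramp, 0.1)
print(ramp[-1] - ramp_hat[-1])          # the tracking error keeps growing
```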
    
### Adaptive Delta Modulation

<img src="img/Snip20191117_37.png"/>

- change the step size according to the changes in the input
- the signs of two successive $\hat{Y}_n$'s are a good criterion for changing the step size: if the two successive outputs have the same sign, the step size should be increased; if they have opposite signs, it should be decreased

\begin{equation}
\Delta_n = \Delta_{n-1} K ^{\epsilon_n \times \epsilon_{n-1}}
\end{equation}

- $\epsilon_n$ is the output of the quantizer before being scaled by the step size
- $K$ is some constant larger than 1
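
A sketch of this adaptation rule follows. The assumptions are spelled out here and in the comments: the accumulator starts at zero, $\epsilon_{-1}$ is taken as 0 so that the first step is not adapted, and $K = 2$ and the test input are arbitrary illustrative choices.

```python
def adaptive_delta_modulate(x, delta0, K=2.0):
    """ADM with step rule delta_n = delta_{n-1} * K ** (eps_n * eps_{n-1})."""
    bits, steps, x_hat = [], [], []
    acc, delta, prev_eps = 0.0, delta0, 0   # prev_eps = 0: no adaptation on the first sample (assumed)
    for x_n in x:
        eps = 1 if x_n >= acc else -1       # unscaled quantizer output
        delta = delta * K ** (eps * prev_eps)
        acc += eps * delta
        bits.append(eps)
        steps.append(delta)
        x_hat.append(acc)
        prev_eps = eps
    return bits, steps, x_hat

# A large jump in the input: equal successive signs make the step grow by K each time.
bits, steps, x_hat = adaptive_delta_modulate([100.0] * 4, 1.0)
print(steps)
print(x_hat)
```

Compared with a fixed step, the growing step lets the estimate catch up with a steep input much sooner, and alternating signs shrink it again once the input flattens.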


**Exercise** The PSD of zero-mean WSS random process $X(t)$ is $S_x(f) = \Lambda({f\over 5000})$ and $max\{X(t)\} = 600$ (**unverified**)

1. what is the power content?

\begin{equation}
P_X = \int_{-\infty}^{\infty} S_x(f) df \\
= 2 \int_{0}^{5000} (1 - {f \over 5000}) df \\
= 2 \times (f - {1\over 10000} f^2)|_0^{5000} \\
= 2 \times (5000 - {1\over 2 \times 5000} 5000^2) = 5000
\end{equation}


2. if this process is sampled at $f_s$ to guarantee a guard band of 2kHz, find $f_s$

\begin{equation}
f_s = 2W + 2 \textrm{ kHz} = 2 \times 5 \textrm{ kHz} + 2 \textrm{ kHz} = 12 \textrm{ kHz}
\end{equation}


3. if we use a uniform PCM system with 256 quantization levels on this process sampled at $f_s$, find resulting SQNR
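    - $P_X = 5000$, $x_{max} = max\{X(t)\} = 600$, $v = \log_2(256) = 8$
    - the first line uses the 6 dB-per-bit approximation; the second evaluates the exact expression
\begin{equation}
SQNR|_{dB} \approx 10 \log_{10} {P_X \over x_{max}^2} + 6v + 4.8 \approx 34.23 \textrm{ dB} \\
SQNR|_{dB} = 10 \log_{10} {3 \times 4^v P_X \over x_{max}^2} \approx 34.36 \textrm{ dB}
\end{equation}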


4. what is the bit rate in part 3
    - the bit rate is $R_b = v f_s = 8 \times 12 \textrm{ kbits/s} = 96 \textrm{ kbits/s}$


5. if the output of the PCM system is to be transmitted using a binary system, what is the required minimum transmission bandwidth of the channel?
    - if the process is sampled at the given $f_s$, then it requires ${vf_s \over 2} = 8 \times 12 \textrm{ kHz} / 2 = 48 \textrm{ kHz}$ of transmission bandwidth
    - if the process is sampled at the Nyquist rate $2W$, then it requires $vW = 8 \times 5000 \textrm{ Hz} = 40 \textrm{ kHz}$


6. if we need to increase the SQNR by at least 25dB, find the required number of quantization levels, the resulting SQNR, and the required transmission bandwidth
    - 25dB => at least 5 more bits per sample (4 more bits give only about 24 dB)
    - $v' = v + 5 = 13$ => $N = 2^{v'} = 8192$
    - resulting SQNR: $SQNR|_{dB} = 10 \log_{10} {3 \times 4^{v'} P_X \over x_{max}^2} \approx 64.47$ dB
    - transmission bandwidth = ${v'f_s \over 2}= 78$ kHz
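
The arithmetic in this exercise can be reproduced with a short script. It is only a check of the numbers, using the values stated in the exercise ($W = 5000$ Hz, $P_X = 5000$, $x_{max} = 600$, a 2 kHz guard band); the variable and function names are hypothetical.

```python
import math

W = 5000.0        # Hz, half-width of the triangular PSD
P_X = 5000.0      # area of the triangle: 0.5 * (2 * 5000) * 1
x_max = 600.0
guard = 2000.0    # Hz

f_s = 2 * W + guard                       # part 2

def sqnr_exact_db(v):
    """10 log10(3 * 4^v * P_X / x_max^2)."""
    return 10 * math.log10(3 * 4 ** v * P_X / x_max ** 2)

def sqnr_approx_db(v):
    """10 log10(P_X / x_max^2) + 6 v + 4.8."""
    return 10 * math.log10(P_X / x_max ** 2) + 6 * v + 4.8

v = int(math.log2(256))                   # part 3: 8 bits per sample
bit_rate = v * f_s                        # part 4
bw_min = v * f_s / 2                      # part 5

v_new = v + math.ceil(25 / 6)             # part 6: about 6 dB per extra bit
bw_new = v_new * f_s / 2

print(f_s, round(sqnr_approx_db(v), 2), round(sqnr_exact_db(v), 2))
print(bit_rate, bw_min)
print(v_new, 2 ** v_new, round(sqnr_exact_db(v_new), 2), bw_new)
```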


# 7.7 The JPEG Image-Coding Standard

- widely used standard for lossy compression of still images
- belongs to the class of transform-coding techniques: do not compress the signal directly, but compress the transform of it
- the most widely used transform technique in image coding is DCT: discrete cosine transform
    - benefits: high degree of energy compaction properties 
    - results in transform coefficients, in which only a few of them have significant values



- the DCT of an $N\times N$ picture with luminance function $x(m, n), 0 \le m,n \le N-1$ can be obtained by:
\begin{equation}
X(0,0) = \frac{1}{N} \sum_{k=0}^{N-1}\sum_{l=0}^{N-1}x(k,l) \\
X(u, v) = \frac{2}{N} \sum_{k=0}^{N-1}\sum_{l=0}^{N-1} x(k,l) \cos[\frac{(2k+1)u\pi}{2N}] \cos[\frac{(2l+1)v\pi}{2N}], u,v \ne 0
\end{equation}


- the $X(0,0)$ coefficient is usually called the DC component, and the other coefficients are called the AC components


- the JPEG encoder consists of three blocks: the DCT component, the quantizer, and the encoder


<img src="img/Snip20191125_39.png" width=80%/>


### The DCT Component

- a picture consists of many pixels in an $m\times n$ array
- first step in DCT transformation is to divide the image array into $8 \times 8$ subarrays
    - if the number of rows or columns is not a multiple of 8, then the last row/column is replicated to make it a multiple of 8
    - replications are removed at the decoder
- then the DCT of each subarray is computed; which generates 64 DCT coefficients for each subarray starting from the DC component $X(0,0)$ and going up to $X(7,7)$
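
A direct (unoptimized) computation of the DCT of one $8 \times 8$ subarray is sketched below. One assumption should be noted: the code uses the common orthonormal DCT-II scaling ${2\over N} c(u) c(v)$ with $c(0) = {1\over\sqrt{2}}$ and $c(k) = 1$ otherwise, which gives ${1\over N}$ for $X(0,0)$ and ${2\over N}$ when both $u, v \ne 0$. The function name and the test blocks are illustrative.

```python
import math

def dct2(block):
    """2-D DCT-II of an N x N block with orthonormal scaling (assumed convention)."""
    n = len(block)

    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for k in range(n):
                for l in range(n):
                    s += (block[k][l]
                          * math.cos((2 * k + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * l + 1) * v * math.pi / (2 * n)))
            out[u][v] = (2.0 / n) * c(u) * c(v) * s
    return out

# A flat subarray: all of the energy should land in the DC component X(0,0).
flat = [[5.0] * 8 for _ in range(8)]
X_flat = dct2(flat)
print(round(X_flat[0][0], 6))

# A smooth ramp: most of the energy stays in a few low-frequency coefficients.
ramp = [[100.0 + 2 * k + 3 * l for l in range(8)] for k in range(8)]
X_ramp = dct2(ramp)
total = sum(val * val for row in X_ramp for val in row)
low = sum(X_ramp[u][v] ** 2 for u in range(2) for v in range(2))
print(low / total)   # fraction of energy in the 2 x 2 low-frequency corner
```

Practical encoders use fast separable algorithms instead of this four-level loop; the direct form is shown only because it mirrors the defining sum.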

### The Quantizer 

- due to energy compaction property of the DCT, only low-frequency components of the DCT coefficients have significant values
- the DC component carries most of the energy, and there exists a strong correlation between the DC component of a subarray and the DC component of the preceding subarray; therefore, a uniform differential quantization scheme is employed for quantization of the DC components
- the AC components are quantized using a uniform quantization scheme
- all quantizers have the same number of quantization regions: 256
- a 64-element quantization table determines the step size for uniform quantization of each DCT component


### The Encoding

- quantization step provides lossy compression of the image
- entropy coding is employed to provide lossless compression of the quantized values



### Compression and Picture Quality in JPEG

- depending on the rate, JPEG can achieve high compression ratios with moderate-to-excellent image quality for both gray and color images
    - 0.2~0.5 bits/pixel for moderate-to-good quality pictures 
    - 0.75~1.5 bits/pixel for excellent quality image
    - 1.5~2 bits/pixel for practically indistinguishable from original