## Audio Signal

<strong>Representation of Sound</strong>

An audio signal is a representation of sound as an electrical (analog) or digital signal that can be stored, manipulated, or transmitted.

<strong>Encodes All Information for Reproduction</strong>

It contains all necessary information (frequency, amplitude, phase, timing) needed to accurately reconstruct the original sound.
<hr>

<img src="../images/analog_digital.png">

<hr>
<h3>Analog Signal</h3>
<ul>
  <li><strong>Continuous Time:</strong> Defined at every instant—no gaps between values.</li>
  <li><strong>Continuous Amplitude:</strong> Can take any value within a range.</li>
  <li><strong>Waveform:</strong> Smooth and continuous (e.g., sine wave).</li>
  <li><strong>Examples:</strong>
    <ul>
      <li>Microphone output</li>
      <li>Vinyl records</li>
      <li>FM/AM radio signals</li>
    </ul>
  </li>
  <li><strong>Noise Sensitivity:</strong> Easily affected by electrical interference.</li>
  <li><strong>Storage/Transmission Issues:</strong> Can degrade over time or distance without correction.</li>
  <li><strong>Analog vs. Digital:</strong> Analog is continuous, while digital is discrete and easier to process with computers.</li>
</ul>
<img src="../images/analog_signal.png">

<hr>
<h3>Digital Signal</h3>
<ul>
  <li><strong>Discrete Values:</strong> Represented by a sequence of individual data points.</li>
  <li><strong>Finite Amplitude Levels:</strong> Amplitude can only take specific predefined values.</li>
  <li><strong>Sampling & Quantization:</strong>
    <ul>
      <li>Sampling: Measuring the signal at regular time intervals.</li>
      <li>Quantization: Rounding the sampled values to the nearest allowed level.</li>
    </ul>
  </li>
  <li><strong>Noise Resistance:</strong> More robust to small distortions or interference.</li>
  <li><strong>Examples:</strong>
    <ul>
      <li>Digital audio (MP3, WAV)</li>
      <li>Compact Discs (CDs)</li>
      <li>Digital TV and communication</li>
    </ul>
  </li>
  <li><strong>Storage & Transmission:</strong> Can be copied, compressed, and transmitted efficiently.</li>
  <li><strong>Requires Conversion:</strong> Uses ADC to digitize and DAC for playback.</li>
</ul>
<img src="../images/sampling.png">
<hr>
<img src="../images/sampling_period.png">
<hr>
<img src="../images/sampling_rate.png">
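
<p>
Sampling at regular intervals can be sketched in a few lines of Python. This is a minimal illustration, not a real ADC: the function name <code>sample_sine</code> and the 1 kHz / 8 kHz values are assumptions chosen so that one cycle spans exactly eight samples.
</p>

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples, amplitude=1.0):
    """Return (time, value) pairs for a sine wave measured at regular intervals."""
    sampling_period = 1.0 / sample_rate_hz  # seconds between consecutive samples
    samples = []
    for n in range(num_samples):
        t = n * sampling_period             # time of the n-th sample
        samples.append((t, amplitude * math.sin(2 * math.pi * freq_hz * t)))
    return samples

# Hypothetical example: a 1 kHz tone sampled at 8 kHz (8 samples = one cycle).
samples = sample_sine(freq_hz=1000, sample_rate_hz=8000, num_samples=8)
for t, value in samples:
    print(f"t = {t * 1000:.3f} ms, value = {value:+.4f}")
```

<p>
The sampling period is simply the reciprocal of the sampling rate, so raising the rate packs the samples closer together in time.
</p>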

<hr>
<h2>Why Sampling Rate = 44,100 Hz?</h2>

<h3>1. Nyquist-Shannon Sampling Theorem</h3>
<blockquote>
  To accurately reconstruct a signal, the sampling rate must be at least <strong>twice the highest frequency</strong> present in the signal.
</blockquote>
<ul>
  <li>Human hearing typically ranges from <strong>20 Hz to ~20,000 Hz</strong>.</li>
  <li>To capture all audible frequencies without distortion:</li>
</ul>
<pre>
Minimum sampling rate = 2 × 20,000 Hz = 40,000 Hz
</pre>
<p>This minimum required rate is called the <strong>Nyquist rate</strong>.</p>

<hr>

<h3>2. Why Not Just 40,000 Hz? Why 44,100 Hz?</h3>
<ul>
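  <li><strong>Buffer for filtering:</strong> Anti-aliasing filters cannot cut off instantly at a single frequency. Sampling at 44.1 kHz puts the Nyquist frequency at 22.05 kHz, which leaves a transition band of about 2 kHz above the 20 kHz audible limit for the filter to roll off.</li>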
  <li><strong>Historical reason:</strong>
    <ul>
      <li>Early digital audio was recorded as PCM on video tape, using adaptors built around NTSC and PAL video equipment.</li>
      <li>These adaptors stored 3 audio samples on each usable video line.</li>
      <li>44,100 samples per second = 3 × 14,700, which works for both standards: 3 × 245 lines × 60 fields/s (NTSC) and 3 × 294 lines × 50 fields/s (PAL).</li>
    </ul>
  </li>
</ul>
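
<p>
The arithmetic behind both reasons can be checked quickly. The sketch below assumes the commonly cited PCM-adaptor figures (3 samples per line, 245 usable lines per field at a nominal 60 fields/s, 294 at 50 fields/s); the variable names are illustrative only.
</p>

```python
SAMPLES_PER_LINE = 3  # assumed: audio samples stored on each usable video line

# (fields per second, usable lines per field), nominal values
NTSC = (60, 245)
PAL = (50, 294)

def samples_per_second(fields_per_second, lines_per_field):
    """Audio samples that fit into one second of video."""
    return fields_per_second * lines_per_field * SAMPLES_PER_LINE

ntsc_rate = samples_per_second(*NTSC)
pal_rate = samples_per_second(*PAL)
print(ntsc_rate, pal_rate)

# Room left for the anti-aliasing filter above the 20 kHz audible limit.
nyquist_hz = ntsc_rate / 2
guard_band_hz = nyquist_hz - 20_000
print(f"Nyquist frequency: {nyquist_hz} Hz, guard band: {guard_band_hz} Hz")
```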

<hr>

<h3>Summary Table</h3>
<table border="1" cellpadding="6">
  <tr>
    <th>Term</th>
    <th>Meaning</th>
  </tr>
  <tr>
    <td><strong>Sampling Rate</strong></td>
    <td>Number of samples per second taken from an analog signal</td>
  </tr>
  <tr>
    <td><strong>Nyquist Frequency</strong></td>
    <td>Half of the sampling rate; max frequency that can be accurately encoded</td>
  </tr>
  <tr>
    <td><strong>Aliasing</strong></td>
    <td>Distortion from under-sampling frequencies above the Nyquist limit</td>
  </tr>
</table>

<hr>

<h3>So, why 44,100 Hz?</h3>
<ul>
  <li>✔️ Covers full <strong>human hearing</strong> range (up to 20 kHz)</li>
  <li>✔️ Meets the <strong>Nyquist requirement</strong></li>
  <li>✔️ Allows room for <strong>filter design</strong></li>
  <li>✔️ Matches <strong>legacy video formats</strong></li>
</ul>
<hr>

<h2>What is Aliasing in Audio?</h2>

<img src="../images/aliasing.png">


> Watch: YouTube — [The intuition behind the Nyquist-Shannon Sampling Theorem](https://www.youtube.com/watch?v=Jv5FU8oUWEY)
<hr>
<h3>Definition</h3>
<p>
  <strong>Aliasing</strong> is a type of distortion that occurs when a continuous signal is <strong>sampled at a rate too low</strong> to accurately capture its frequency content.
</p>
<p>
  According to the <strong>Nyquist-Shannon Sampling Theorem</strong>:
</p>
<pre>
Sampling Rate ≥ 2 × Maximum Frequency
</pre>
<p>
  If the sampling rate is less than this threshold, high-frequency components will be incorrectly represented as lower frequencies — these are called <strong>aliased frequencies</strong>.
</p>
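
<p>
A short numerical check makes this concrete. In the sketch below (sampling rate and tone frequencies are assumed values for illustration), a 5 kHz cosine sampled at 8 kHz produces the same sample values as a 3 kHz cosine, so the two tones cannot be told apart after sampling.
</p>

```python
import math

def cosine_samples(freq_hz, sample_rate_hz, num_samples):
    """Sample a unit-amplitude cosine at regular intervals."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]

sample_rate_hz = 8000                                   # Nyquist frequency: 4000 Hz
too_high = cosine_samples(5000, sample_rate_hz, 16)     # above the Nyquist frequency
alias = cosine_samples(3000, sample_rate_hz, 16)        # 8000 - 5000 = 3000 Hz

# The two sample sequences agree up to floating-point rounding.
max_difference = max(abs(a - b) for a, b in zip(too_high, alias))
print(f"largest sample difference: {max_difference:.2e}")
```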

<hr>

<h3>Example</h3>
<p>
  Suppose you have a sine wave at <strong>10 kHz</strong>, but you sample it at <strong>15 kHz</strong>.
</p>
<ul>
  <li>Nyquist frequency = 15 kHz ÷ 2 = <strong>7.5 kHz</strong></li>
  <li>Since 10 kHz &gt; 7.5 kHz → <strong>Aliasing will occur</strong></li>
  <li>The signal will be incorrectly interpreted as a <strong>5 kHz</strong> wave, because the alias appears at |10 kHz − 15 kHz| = 5 kHz</li>
</ul>
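
<p>
The example above can be generalized with a small helper that folds any input frequency into the range from 0 to the Nyquist frequency. The function name <code>aliased_frequency</code> is an assumption for this sketch, and it only covers a single pure tone.
</p>

```python
def aliased_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after sampling."""
    folded = freq_hz % sample_rate_hz       # sampling cannot tell f from f + k * fs
    if folded > sample_rate_hz / 2:         # the upper half mirrors back down
        folded = sample_rate_hz - folded
    return folded

print(aliased_frequency(10_000, 15_000))    # 5000 (the example above)
print(aliased_frequency(1_000, 44_100))     # 1000 (below Nyquist, unchanged)
```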

<hr>

<h3>Key Points</h3>
<table border="1" cellpadding="6">
  <tr>
    <th>Concept</th>
    <th>Explanation</th>
  </tr>
  <tr>
    <td><strong>Nyquist Frequency</strong></td>
    <td>Half the sampling rate; max frequency that can be accurately captured</td>
  </tr>
  <tr>
    <td><strong>Aliased Frequency</strong></td>
    <td>A false frequency that appears when sampling too slowly</td>
  </tr>
  <tr>
    <td><strong>Solution</strong></td>
    <td>Use a higher sampling rate or apply a low-pass (anti-aliasing) filter before sampling</td>
  </tr>
</table>

<hr>

<h3>How to Prevent Aliasing</h3>
<ol>
  <li><strong>Use a high enough sampling rate</strong> (e.g., 44.1 kHz for audio up to 20 kHz).</li>
  <li><strong>Apply an anti-aliasing filter</strong> — a low-pass filter that removes frequencies above the Nyquist limit before sampling.</li>
</ol>

<hr>

<h3>Visual Analogy</h3>
<p>
  Think of it like filming a spinning wheel:
</p>
<ul>
  <li>If the frame rate is too low, the wheel might appear to spin backward.</li>
  <li>That's visual aliasing — in audio, aliasing makes high-pitched sounds appear as lower ones.</li>
</ul>


<hr>
<h2>What is <u>Quantization</u> in Digital Audio?</h2>

<img src="../images/sine_wave_quantisation.png">

<p>
Quantization is the process of <strong>mapping a range of continuous amplitude values</strong> from an analog audio signal into a <strong>finite set of discrete levels</strong> during digital conversion.
</p>

<hr>

<h3>How It Works</h3>
<ol>
  <li>The analog signal is first <strong>sampled</strong> at regular time intervals (e.g., 44,100 times per second).</li>
  <li>Each sample has an amplitude value (the instantaneous signal level), which is then <strong>rounded to the nearest quantization level</strong>.</li>
  <li>The number of available levels depends on the <strong>bit depth</strong>.</li>
</ol>
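
<p>
The rounding step can be sketched as a uniform quantizer. This is one possible scheme, assuming samples are normalized to the range −1.0 to 1.0 and stored as signed integer codes; the function name <code>quantize</code> is hypothetical, and real converters may round differently.
</p>

```python
def quantize(sample, bit_depth):
    """Round a normalized sample to the nearest signed integer code."""
    scale = 2 ** (bit_depth - 1)               # e.g. 32768 for 16-bit audio
    code = round(sample * scale)               # nearest integer code
    code = max(-scale, min(scale - 1, code))   # clamp to the representable range
    return code, code / scale                  # stored code and the amplitude it represents

print(quantize(0.3, 3))     # (1, 0.25): only 8 levels, so the error is large
print(quantize(0.3, 16))    # much closer to 0.3
```

<p>
With 3 bits the value 0.3 lands on 0.25, while with 16 bits the result is within a few hundred-thousandths of the original.
</p>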

<hr>

<h3>Resolution & Bit Depth</h3>
<ul>
  <li><strong>Bit Depth</strong> = Number of bits used to store each sample.</li>
  <li><strong>Resolution</strong> = How many unique amplitude levels are available.</li>
</ul>

<pre>
Resolution = 2<sup>n</sup> levels, where n = number of bits
</pre>

<ul>
  <li><strong>8-bit audio</strong> = 256 levels (2<sup>8</sup>)</li>
  <li><strong>16-bit audio (CD quality)</strong> = 65,536 levels (2<sup>16</sup>)</li>
  <li><strong>24-bit audio</strong> = 16,777,216 levels (2<sup>24</sup>)</li>
</ul>
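
<p>
The level counts above follow directly from the formula. A quick check (the helper name <code>quantization_levels</code> is an assumption):
</p>

```python
def quantization_levels(bit_depth):
    """Number of distinct amplitude levels available at a given bit depth."""
    return 2 ** bit_depth

for bit_depth in (8, 16, 24):
    print(f"{bit_depth}-bit: {quantization_levels(bit_depth):,} levels")
```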

<hr>

<h3>CD Audio Example</h3>
<ul>
  <li><strong>Sample rate</strong>: 44,100 samples per second</li>
  <li><strong>Bit depth</strong>: 16 bits per sample</li>
  <li>This gives a <strong>dynamic range</strong> of about 96 dB</li>
</ul>

<hr>

<h3>Quantization Error</h3>
<p>
Because each sample is rounded to the nearest level, it can differ from the true value by up to half a quantization step. This difference is called <strong>quantization error</strong>, and it is heard as <strong>quantization noise</strong>.
</p>

<ul>
  <li>It is more noticeable in low-bit-depth audio (like 8-bit).</li>
  <li>Higher bit depths result in more accurate audio with less quantization error.</li>
</ul>
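
<p>
The effect of bit depth on the error can be measured on a test signal. The sketch below assumes a simple rounding quantizer on samples normalized to −1.0 to 1.0 and a half-scale sine wave as input; both choices are illustrative, not a standard measurement procedure.
</p>

```python
import math

def quantize(sample, bit_depth):
    """Round a normalized sample to the nearest level (assumed uniform quantizer)."""
    scale = 2 ** (bit_depth - 1)
    code = max(-scale, min(scale - 1, round(sample * scale)))
    return code / scale

def max_quantization_error(bit_depth, num_samples=1000):
    """Largest rounding error over one cycle of a half-scale sine wave."""
    worst = 0.0
    for n in range(num_samples):
        x = 0.5 * math.sin(2 * math.pi * n / num_samples)
        worst = max(worst, abs(x - quantize(x, bit_depth)))
    return worst

for bits in (8, 16):
    half_step = 0.5 / 2 ** (bits - 1)   # theoretical bound: half a quantization step
    print(f"{bits}-bit: max error {max_quantization_error(bits):.2e} "
          f"(bound {half_step:.2e})")
```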

<hr>

<h3>Summary Table</h3>
<table border="1" cellpadding="6">
  <tr>
    <th>Term</th>
    <th>Description</th>
  </tr>
  <tr>
    <td><strong>Quantization</strong></td>
    <td>Rounding continuous amplitude values to discrete levels</td>
  </tr>
  <tr>
    <td><strong>Bit Depth</strong></td>
    <td>Number of bits used per sample (e.g., 16-bit)</td>
  </tr>
  <tr>
    <td><strong>Resolution</strong></td>
    <td>Number of distinct amplitude levels (2<sup>bit depth</sup>)</td>
  </tr>
  <tr>
    <td><strong>Quantization Error</strong></td>
    <td>Difference between actual signal and quantized value (noise)</td>
  </tr>
</table>

<hr>

<h3>Key Insight</h3>
<p>
Quantization is what allows us to <strong>store audio in binary form</strong>, but it introduces slight inaccuracies. The more bits you use, the more precise the digital representation becomes!
</p>


<hr>
<h2>Memory Calculation for 1 Minute of Audio</h2>

<h3>Given:</h3>
<ul>
  <li><strong>Sampling Rate</strong> = 44,100 Hz (samples per second)</li>
  <li><strong>Bit Depth</strong> = 16 bits (per sample)</li>
  <li><strong>Duration</strong> = 60 seconds</li>
</ul>

<hr>

<h3>Formula</h3>
<pre>
Memory (bits) = Sampling Rate × Bit Depth × Duration × Number of Channels
Memory (bytes) = Memory (bits) ÷ 8
</pre>
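
<p>
The formula can be sketched as a small function for uncompressed PCM audio. The name <code>audio_size_bytes</code> and the <code>channels</code> parameter are assumptions added for illustration; file headers and compression are ignored.
</p>

```python
def audio_size_bytes(sample_rate_hz, bit_depth, duration_s, channels=1):
    """Size of uncompressed PCM audio in bytes."""
    total_bits = sample_rate_hz * bit_depth * duration_s * channels
    return total_bits // 8

mono = audio_size_bytes(44_100, 16, 60)
stereo = audio_size_bytes(44_100, 16, 60, channels=2)
print(f"mono:   {mono:,} bytes = {mono / 1_000_000:.2f} MB")
print(f"stereo: {stereo:,} bytes = {stereo / 1_000_000:.2f} MB")
```

<p>
Here 1 MB is taken as 1,000,000 bytes. One minute at CD settings comes to 5,292,000 bytes in mono and 10,584,000 bytes in stereo.
</p>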

<h3>For Mono (1 Channel):</h3>
<pre>
Memory = (44,100 × 16 × 60) ÷ 8
       = 42,336,000 bits ÷ 8
       = 5,292,000 bytes ≈ 5.3 MB (about 5.05 MiB)
</pre>

<h3>For Stereo (2 Channels):</h3>
<pre>
Memory = 5,292,000 × 2
       = 10,584,000 bytes ≈ 10.6 MB (about 10.1 MiB)
</pre>

<hr>

<h3>Summary Table</h3>
<table border="1" cellpadding="6">
  <tr>
    <th>Channels</th>
    <th>Memory for 1 Minute</th>
  </tr>
  <tr>
    <td>Mono (1 channel)</td>
    <td>~5.3 MB</td>
  </tr>
  <tr>
    <td>Stereo (2 channels)</td>
    <td>~10.6 MB</td>
  </tr>
</table>

<hr>

<h3>Key Insight</h3>
<p>
Higher <strong>sampling rates</strong> and <strong>bit depths</strong> improve audio quality but increase memory usage.
</p>
<p>
<strong>Stereo audio</strong> doubles the memory required compared to mono.
</p>


<hr>
<h2>What is <u>Dynamic Range</u> in Audio?</h2>

<p>
Dynamic range refers to the <strong>difference between the quietest and loudest sound</strong> a recording system or device can accurately capture or reproduce.
</p>

<hr>

<h3>Definition</h3>
<blockquote>
Dynamic Range = Loudest Level (Max) − Quietest Level (Min)
</blockquote>

<p>
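It's usually measured in <strong>decibels (dB)</strong>, a logarithmic unit, so this difference corresponds to the ratio between the loudest and quietest levels. The higher the dynamic range, the more subtle details can be preserved without distortion or noise.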
</p>

<hr>

<h3>How It Relates to Bit Depth</h3>
<p>
Dynamic range in digital systems is directly related to <strong>bit depth</strong>. Each additional bit doubles the number of amplitude levels, which adds about 6 dB of dynamic range.
</p>

<pre>
Dynamic Range ≈ 6.02 × Bit Depth (in dB)
</pre>

<ul>
  <li>8-bit audio → ~48 dB dynamic range</li>
  <li>16-bit audio (CD) → ~96 dB</li>
  <li>24-bit audio → ~144 dB</li>
</ul>
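
<p>
The 6.02 dB-per-bit figure comes from expressing the ratio between the largest and smallest level in decibels: 20 × log<sub>10</sub>(2<sup>n</sup>) = n × 20 × log<sub>10</sub>(2) ≈ 6.02 × n. A minimal check, with <code>dynamic_range_db</code> as an assumed helper name:
</p>

```python
import math

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of ideal PCM at a given bit depth, in dB."""
    return 20 * math.log10(2 ** bit_depth)

for bit_depth in (8, 16, 24):
    print(f"{bit_depth}-bit: {dynamic_range_db(bit_depth):.1f} dB")
```

<p>
This is the idealized figure; real converters fall short of it because of analog noise.
</p>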

<hr>

<h3>📌 Real-World Implications</h3>
<ul>
  <li><strong>Dynamic range too low</strong>: Quiet details may be lost in noise.</li>
  <li><strong>Signal level too high</strong>: May cause clipping (distortion at the upper limit).</li>
  <li><strong>Professional recordings</strong> use higher bit depths to capture subtle detail without introducing noise.</li>
</ul>

<hr>

<h3>Summary Table</h3>
<table border="1" cellpadding="6">
  <tr>
    <th>Bit Depth</th>
    <th>Approx. Dynamic Range</th>
  </tr>
  <tr>
    <td>8-bit</td>
    <td>~48 dB</td>
  </tr>
  <tr>
    <td>16-bit (CD)</td>
    <td>~96 dB</td>
  </tr>
  <tr>
    <td>24-bit</td>
    <td>~144 dB</td>
  </tr>
</table>

<hr>

<h3>Audio Insight</h3>
<p>
Dynamic range is what lets you hear both the softest whisper and the loudest explosion in a high-quality audio track — without distortion or noise.
</p>
