# Multimedia Processing Course - Part 3: Audio Processing

In this notebook, we explore sound. We look at audio not only as a waveform over time but also in terms of its frequency content.

**Content:**
1.  **Level 1 (Basic)**: Loading and visualizing Waveforms.
2.  **Level 2 (Intermediate)**: Manipulating Audio (Volume, Speed).
3.  **Level 3 (Advanced)**: Frequency Domain (FFT) and Spectrograms.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.io.wavfile as wav

# Load the audio
samplerate, data = wav.read('datasets/sample_audio.wav')
print(f"Sample Rate: {samplerate} Hz")
print(f"Duration: {len(data)/samplerate:.2f} s")

### Explanation
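We loaded the audio with `scipy.io.wavfile.read`, which returns the sample rate in Hz and the samples as a NumPy array. Dividing the number of samples by the sample rate gives the duration in seconds.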
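The array returned by `wav.read` can differ from file to file: its dtype follows the file's bit depth, and stereo files come back with one column per channel. The sketch below is one possible way to inspect that and convert to a mono float signal. It assumes a synthetic 440 Hz tone written to a temporary file rather than the course dataset; `to_mono_float`, `demo_rate`, and `demo_tone.wav` are hypothetical names, and the scaling ignores the offset that unsigned 8-bit WAV files would need.

```python
import os
import tempfile

import numpy as np
import scipy.io.wavfile as wav


def to_mono_float(samples):
    """Return a float64 mono signal scaled to roughly [-1, 1] (hypothetical helper)."""
    if np.issubdtype(samples.dtype, np.integer):
        # Scale by the largest magnitude the integer type can hold
        # (assumes signed PCM; unsigned 8-bit data would also need an offset)
        scale = float(np.iinfo(samples.dtype).max) + 1.0
        samples = samples.astype(np.float64) / scale
    else:
        samples = samples.astype(np.float64)
    if samples.ndim == 2:
        # Stereo or multi-channel: average the channels
        samples = samples.mean(axis=1)
    return samples


# Build a 1-second stereo 440 Hz tone as 16-bit PCM
demo_rate = 8000
t = np.arange(demo_rate) / demo_rate
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
stereo = np.column_stack([tone, tone])

# Round-trip through a temporary WAV file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_tone.wav")
    wav.write(path, demo_rate, stereo)
    rate_in, raw = wav.read(path)

mono = to_mono_float(raw)
print(raw.dtype, raw.shape)    # what the file contains
print(mono.dtype, mono.shape)  # what later processing would use
```

Working on a mono float copy keeps later steps such as the FFT independent of the file's bit depth and channel count.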

## Level 1: Time Domain
The most basic visualization is the waveform: Amplitude vs Time.

In [None]:
# Create a time axis: sample n occurs at n / samplerate seconds
time = np.linspace(0, len(data) / samplerate, num=len(data), endpoint=False)

plt.figure(figsize=(12, 4))
plt.plot(time, data)
plt.title("Audio Waveform")
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")
plt.xlim(0, 0.05) # Zoom in to first 0.05 seconds to see the wave
plt.show()

### Explanation
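We use `np.linspace` to generate one time value per sample, so the x-axis is in seconds rather than sample indices. `plt.xlim` zooms in on the first 0.05 seconds, because individual oscillations are too dense to see at full scale.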
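Sample `n` is taken at `n / samplerate` seconds, so the last sample sits one sampling interval before the total duration. The sketch below uses made-up values (`demo_rate` and `n_samples` are hypothetical) to compare that definition with `np.linspace` with and without the endpoint.

```python
import numpy as np

demo_rate = 8    # hypothetical tiny sample rate (Hz), chosen for readability
n_samples = 16   # two seconds of audio at that rate

# Sample n is taken at n / rate seconds
t_exact = np.arange(n_samples) / demo_rate

# linspace matches this only when the endpoint is excluded
t_open = np.linspace(0, n_samples / demo_rate, num=n_samples, endpoint=False)
t_closed = np.linspace(0, n_samples / demo_rate, num=n_samples)

print("exact spacing:  ", t_exact[1] - t_exact[0])
print("open spacing:   ", t_open[1] - t_open[0])
print("closed spacing: ", t_closed[1] - t_closed[0])
```

With the endpoint included, the spacing is stretched slightly so that the last value lands on the full duration. For long recordings the difference is too small to see in a plot, but it matters when timestamps are compared numerically.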

## Level 2: Simple Manipulation
Since audio is just a NumPy array, we can do math on it.

In [None]:
# Decrease volume (multiply by a constant below 1)
quieter_audio = data * 0.5

# Reverse Audio
reversed_audio = data[::-1]

# Save the manipulated audio
wav.write('datasets/reversed_audio.wav', samplerate, reversed_audio.astype(np.int16))

### Explanation
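`data * 0.5` halves the amplitude, which makes the audio quieter; the result is a floating-point array. `data[::-1]` reverses the sample order, so the audio plays backwards. `wav.write` saves the result to disk; the cast to `np.int16` assumes the source file is 16-bit PCM.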
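Gains above 1 can push samples past the 16-bit range, so a cast back to `np.int16` usually needs clipping first. The sketch below shows one way to do that on a synthetic tone, assuming 16-bit PCM; `apply_gain` and `demo_rate` are hypothetical names. It also illustrates speed, which the outline lists for this level, using the crudest possible approach: dropping every second sample.

```python
import numpy as np


def apply_gain(samples, gain):
    """Scale int16 audio by `gain`, clipping to the 16-bit range (hypothetical helper)."""
    scaled = samples.astype(np.float64) * gain
    limits = np.iinfo(np.int16)
    return np.clip(np.round(scaled), limits.min, limits.max).astype(np.int16)


demo_rate = 8000
t = np.arange(demo_rate) / demo_rate
tone = (0.8 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

quieter = apply_gain(tone, 0.5)  # half amplitude
louder = apply_gain(tone, 2.0)   # peaks are clipped instead of wrapping around

# Crude speed change: keep every second sample and play at the same rate.
# This doubles the speed and raises the pitch by one octave. Without a
# low-pass filter first, content above the new Nyquist limit would alias.
faster = tone[::2]

print("peak values:", tone.max(), quieter.max(), louder.max())
print("duration of faster clip:", len(faster) / demo_rate, "s")
```

Clipping distorts the waveform, so in practice a gain is often chosen to keep the peak just below the limit rather than relying on the clip.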

## Level 3: Frequency Domain (The Spectral View)
Sounds are vibrations. We often want to know *what frequencies* are present (e.g., detecting a specific musical note). We use the **Fourier Transform**.

In [None]:
# Perform FFT (Fast Fourier Transform)
fft_spectrum = np.fft.rfft(data)
freqs = np.fft.rfftfreq(len(data), 1/samplerate)

# Plot the magnitude spectrum
plt.figure(figsize=(12, 4))
plt.plot(freqs, np.abs(fft_spectrum))
plt.title("Frequency Spectrum")
plt.xlabel("Frequency (Hz)")
plt.ylabel("Magnitude")
plt.xlim(0, 1000) # Zoom in to relevant frequencies
plt.show()

### Explanation
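`np.fft.rfft` computes the Discrete Fourier Transform of real-valued input and returns only the non-negative frequency components. `np.fft.rfftfreq` gives the frequency in Hz that corresponds to each component.
You should see a spike at 440 Hz (if using the generated sine wave), which corresponds to the note A4.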
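The peak can also be read off numerically instead of from the plot. The sketch below assumes a synthetic one-second 440 Hz sine wave (`demo_rate` is a hypothetical value, not taken from the course file) and picks the frequency bin with the largest magnitude.

```python
import numpy as np

demo_rate = 8000                      # assumed sample rate (Hz)
t = np.arange(demo_rate) / demo_rate  # exactly one second of samples
tone = np.sin(2 * np.pi * 440 * t)

spectrum = np.fft.rfft(tone)
freqs = np.fft.rfftfreq(len(tone), d=1 / demo_rate)

# Bin spacing is rate / N; the strongest bin marks the dominant frequency
bin_width = demo_rate / len(tone)
peak_freq = freqs[np.argmax(np.abs(spectrum))]

print(f"Bin width: {bin_width} Hz")
print(f"Dominant frequency: {peak_freq} Hz")
```

The bin width sets how precisely a frequency can be located: a longer recording gives narrower bins, while a tone that falls between two bins spreads its energy over several neighbours.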

### Spectrogram
A spectrum shows frequencies for the *whole* duration. A **Spectrogram** shows how frequencies change *over time*.

In [None]:
plt.figure(figsize=(12, 6))
plt.specgram(data, Fs=samplerate, NFFT=1024, noverlap=512, cmap='inferno')
plt.title("Spectrogram")
plt.ylabel("Frequency (Hz)")
plt.xlabel("Time (s)")
plt.colorbar(label="Intensity (dB)")
plt.show()

### Explanation
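`plt.specgram` splits the signal into windows of `NFFT=1024` samples that overlap by `noverlap=512` samples, computes an FFT for each window, and plots the results side by side. Brighter colors mean more energy at that frequency and time.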
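The window length trades frequency resolution against time resolution. The sketch below makes that concrete with `scipy.signal.spectrogram`, swapped in for `plt.specgram` here because it returns the arrays directly. The test signal is an assumption: a tone that jumps from 440 Hz to 880 Hz after one second, sampled at a hypothetical `demo_rate`.

```python
import numpy as np
from scipy import signal

demo_rate = 8000
t = np.arange(2 * demo_rate) / demo_rate

# Tone that jumps from 440 Hz to 880 Hz after one second
two_tone = np.where(
    t < 1.0,
    np.sin(2 * np.pi * 440 * t),
    np.sin(2 * np.pi * 880 * t),
)

# Same window settings as the plot: 1024-sample windows, 512-sample overlap
f, seg_times, Sxx = signal.spectrogram(
    two_tone, fs=demo_rate, nperseg=1024, noverlap=512
)

# Strongest frequency bin in each time slice
dominant = f[np.argmax(Sxx, axis=0)]

print(f"Frequency resolution: {f[1] - f[0]:.4f} Hz")
print(f"Time step: {seg_times[1] - seg_times[0]:.3f} s")
print(f"First slice peaks near {dominant[0]:.1f} Hz, last near {dominant[-1]:.1f} Hz")
```

A larger window narrows the frequency bins but lengthens the time step, so short events blur. A smaller window does the opposite.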