# 3. Spectrogram (Time–Frequency Representation)

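Speech is non-stationary: its frequency content changes from one sound to the next.
A single FFT over the whole recording shows which frequencies are present,
but not when they occur.

A spectrogram solves this by applying the FFT
to short, overlapping windows of audio and stacking the results over time.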
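
To make the limitation concrete, here is a small NumPy sketch (the sample rate and tone frequencies are arbitrary illustrative values, not taken from `sample.ogg`). It builds a low-then-high tone sequence and its time-reversed copy, then compares their whole-signal magnitude spectra.

```python
import numpy as np

sr = 8000                      # illustrative sample rate in Hz
t = np.arange(sr // 2) / sr    # 0.5 s of sample times

low = np.sin(2 * np.pi * 440 * t)
high = np.sin(2 * np.pi * 1760 * t)

rising = np.concatenate([low, high])  # low tone first, then high tone
falling = rising[::-1]                # same samples played backwards

# One FFT over each full signal; keep only the magnitudes.
mag_rising = np.abs(np.fft.rfft(rising))
mag_falling = np.abs(np.fft.rfft(falling))

# The magnitude spectra match even though the signals differ in time.
same_spectrum = np.allclose(mag_rising, mag_falling)
same_signal = np.allclose(rising, falling)
print("same magnitude spectrum:", same_spectrum)
print("same waveform:", same_signal)
```

The order of the two tones is hidden in the phase, which a magnitude plot discards. Analysing short windows one after another is what makes that ordering visible.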


## Short-Time Fourier Transform (STFT)

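- The audio is split into short, overlapping frames (typically 20 to 40 ms for speech)
- Each frame is multiplied by a window function, then an FFT is applied to it
- Stacking the per-frame spectra produces a 2D representation:
  time × frequency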
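
The steps above can be sketched in plain NumPy. This is a minimal illustration, not the project's `compute_spectrogram`: the function name `stft_magnitude`, the Hann window, and the default `frame_length` and `hop_length` values are all assumptions chosen for the example, and the input is assumed to be at least one frame long.

```python
import numpy as np

def stft_magnitude(signal, frame_length=512, hop_length=128):
    """Return |STFT| with shape (frequency bins, frames)."""
    window = np.hanning(frame_length)
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    # Slice overlapping frames and taper each one with the window.
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_length] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; rfft keeps only the non-negative frequencies.
    return np.abs(np.fft.rfft(frames, axis=1)).T

sr = 8000
t = np.arange(sr) / sr
# Synthetic test signal: 440 Hz for the first half second, then 1760 Hz.
tone = np.where(t < 0.5,
                np.sin(2 * np.pi * 440 * t),
                np.sin(2 * np.pi * 1760 * t))

spec = stft_magnitude(tone)
freqs = np.fft.rfftfreq(512, d=1 / sr)  # centre frequency of each bin

peak_first = freqs[spec[:, 0].argmax()]   # strongest bin in the first frame
peak_last = freqs[spec[:, -1].argmax()]   # strongest bin in the last frame
print(spec.shape, peak_first, peak_last)
```

Each column of `spec` is the spectrum of one frame, so the dominant frequency can be read off per frame. A shorter `frame_length` gives finer time resolution at the cost of coarser frequency resolution, and vice versa.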


In [1]:
# --- Notebook setup (shared across all notebooks) ---
import sys
import os

PROJECT_ROOT = os.path.abspath("..")
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)

import matplotlib
matplotlib.use("Agg")  # headless-safe for script execution

import matplotlib.pyplot as plt

from src.load_audio import load_audio
from src.graph_utils import save_graph


In [2]:
import librosa.display
from src.spectrogram import compute_spectrogram

signal, sr = load_audio("sample.ogg")

spec_db = compute_spectrogram(signal, sr)

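plt.figure(figsize=(12, 5))
# specshow derives the time axis from sr and hop_length (default 512),
# so the axis is only correct if compute_spectrogram uses the same hop.
librosa.display.specshow(
    spec_db,
    sr=sr,
    x_axis="time",
    y_axis="hz"
)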
plt.colorbar(label="dB")
plt.title("Spectrogram")
plt.tight_layout()

save_graph("03_spectrogram.png")
plt.close()
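
The colorbar above is labelled in dB. `compute_spectrogram` lives in `src/spectrogram.py` and is not shown in this notebook, so the following is only a guess at the kind of conversion it performs: a sketch of turning STFT magnitudes into decibels relative to the loudest bin. The helper name and the `top_db` and `eps` parameters are hypothetical. librosa provides the same steps as `librosa.stft` and `librosa.amplitude_to_db`.

```python
import numpy as np

def magnitude_to_db(magnitude, top_db=80.0, eps=1e-10):
    """Hypothetical helper: convert magnitudes to dB relative to the peak."""
    ref = max(magnitude.max(), eps)  # guard against an all-zero input
    # 20 * log10 because these are amplitudes, not powers.
    db = 20.0 * np.log10(np.maximum(magnitude, eps) / ref)
    # Clip so that everything quieter than -top_db maps to -top_db.
    return np.maximum(db, -top_db)

# Tiny made-up magnitude matrix: each factor of 10 is a 20 dB step.
magnitude = np.array([[1.0, 0.1],
                      [0.01, 0.0]])
db = magnitude_to_db(magnitude)
print(db)
```

The logarithmic scale compresses the large dynamic range of speech, so quiet harmonics stay visible next to loud ones in the plot.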


## Why Spectrograms Are Important

- Preserve the time information that a single FFT discards
- Reveal phoneme transitions as changes in the frequency pattern over time
- Serve as 2D, image-like input to CNN-based speech models
