# 2. Fast Fourier Transform (FFT)

Speech contains many frequencies at once, produced by
the vocal cords and shaped by the vocal tract.

The FFT lets us analyze **which frequencies** are present
in the signal and how strong each one is.
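To make the FFT's output concrete, here is a minimal sketch of the discrete Fourier transform (DFT) that the FFT computes efficiently. The helper `naive_dft` is a hypothetical name used only for illustration; it is not the project's `compute_fft`, whose internals are not shown here. The sketch checks the direct O(N²) sum against NumPy's `np.fft.fft` on random test data.

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) evaluation of X[k] = sum_n x[n] * exp(-2j*pi*k*n/N).

    Hypothetical illustration only; the FFT gives the same values faster.
    """
    x = np.asarray(x, dtype=complex)
    n = np.arange(len(x))          # sample indices 0..N-1
    k = n.reshape(-1, 1)           # frequency-bin indices as a column
    twiddle = np.exp(-2j * np.pi * k * n / len(x))  # N x N matrix of complex sinusoids
    return twiddle @ x

# Random test signal (illustrative data, not the notebook's audio)
rng = np.random.default_rng(0)
x = rng.standard_normal(64)

X_naive = naive_dft(x)
X_fft = np.fft.fft(x)
matches = np.allclose(X_naive, X_fft)
print("naive DFT matches np.fft.fft:", matches)
```

Each output value `X[k]` measures how strongly the signal correlates with a complex sinusoid at bin `k`; its absolute value is the magnitude plotted later in this notebook.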


## Why Frequency Domain?

- Human speech is not a single tone
- Different phonemes correspond to different frequency patterns
- Frequency-domain analysis reveals structure that is hard to see in the raw waveform
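As a small illustration of the points above, the sketch below builds a synthetic two-tone signal and recovers both tones from its spectrum. The sample rate, tone frequencies, and all `demo_` names are assumptions chosen for the example; they are unrelated to `sample.ogg`.

```python
import numpy as np

demo_sr = 8000                               # assumed sample rate (Hz)
demo_t = np.arange(demo_sr) / demo_sr        # one second of sample times

# Two tones mixed together: the waveform alone does not show them clearly.
demo_signal = (np.sin(2 * np.pi * 200 * demo_t)
               + 0.5 * np.sin(2 * np.pi * 1200 * demo_t))

# rfft returns only the non-negative frequencies of a real signal.
demo_magnitude = np.abs(np.fft.rfft(demo_signal))
demo_freqs = np.fft.rfftfreq(len(demo_signal), d=1 / demo_sr)

# The two largest bins should sit at the two tone frequencies.
top_two = np.sort(demo_freqs[np.argsort(demo_magnitude)[-2:]])
print("strongest frequencies (Hz):", top_two)
```

Because the signal is exactly one second long, the bins are spaced 1 Hz apart and both tones fall on exact bins, which keeps the peaks sharp. Real speech will not be this clean.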


In [1]:
# --- Notebook setup (shared across all notebooks) ---
import sys
import os

PROJECT_ROOT = os.path.abspath("..")
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)

import matplotlib
matplotlib.use("Agg")  # headless-safe for script execution

import matplotlib.pyplot as plt

from src.load_audio import load_audio
from src.graph_utils import save_graph


In [2]:
import numpy as np
from src.fft_from_scratch import compute_fft

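signal, sr = load_audio("sample.ogg")

# compute_fft is this project's own implementation (src/fft_from_scratch.py);
# it is used here as returning one magnitude value per FFT bin.
fft_magnitude = compute_fft(signal)

# Frequency in Hz for each bin; the second half holds negative frequencies.
freqs = np.fft.fftfreq(len(fft_magnitude), d=1/sr)

# The spectrum of a real signal is symmetric, so plot only the first half.
half = len(freqs) // 2
plt.figure(figsize=(12, 4))
plt.plot(freqs[:half], fft_magnitude[:half])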
plt.title("FFT Magnitude Spectrum")
plt.xlabel("Frequency (Hz)")
plt.ylabel("Magnitude")
plt.tight_layout()

save_graph("02_fft.png")
plt.close()


## Interpretation

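- Peaks correspond to the dominant frequencies in the signal
- A single FFT over the whole recording loses time information: it shows which frequencies occur, but not when
- Speech is non-stationary (its frequency content changes over time) → we need time–frequency analysis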
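The loss of time information can be demonstrated with a minimal sketch, assuming two synthetic signals that contain the same two tones in opposite order. The function `dominant_freq_per_frame` and all parameter values are hypothetical choices for illustration, not part of this project's `src` package. The whole-signal magnitude spectra come out the same, while a frame-by-frame FFT recovers the order.

```python
import numpy as np

demo_sr = 8000                                # assumed sample rate (Hz)
seg_t = np.arange(demo_sr // 2) / demo_sr     # 0.5 s per tone
low = np.sin(2 * np.pi * 400 * seg_t)
high = np.sin(2 * np.pi * 1600 * seg_t)

low_then_high = np.concatenate([low, high])
high_then_low = np.concatenate([high, low])

# Whole-signal FFT: the magnitude spectrum cannot tell the two orders apart.
mag_a = np.abs(np.fft.rfft(low_then_high))
mag_b = np.abs(np.fft.rfft(high_then_low))
same_spectrum = np.allclose(mag_a, mag_b)

def dominant_freq_per_frame(x, sr, frame_len=1000):
    """Split x into non-overlapping frames and return each frame's peak frequency."""
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    window = np.hanning(frame_len)            # taper frame edges to reduce leakage
    mags = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1 / sr)
    return freqs[np.argmax(mags, axis=1)]

track_a = dominant_freq_per_frame(low_then_high, demo_sr)
track_b = dominant_freq_per_frame(high_then_low, demo_sr)

print("identical magnitude spectra:", same_spectrum)
print("low then high:", track_a)
print("high then low:", track_b)
```

Analyzing short frames trades frequency resolution for time resolution: here each 1000-sample frame has bins 8 Hz apart instead of 1 Hz, but the per-frame peaks reveal when each tone occurs. Practical time–frequency analysis typically uses overlapping frames rather than the non-overlapping ones in this sketch.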
