# 02: Feature Inspection

This notebook loads and visualizes the features generated by `src/feature_extraction.py` and stored in `data/features/`.

We will:
1.  Load a sample feature file (`.npy`).
2.  Visualize the **Mel Spectrogram**.
3.  Visualize the **Pitch Contour (F0)**.
4.  Visualize the **MFCCs**.

This is a sanity check that the features were computed and stored correctly before we use them for training.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import librosa
import librosa.display
from pathlib import Path
import seaborn as sns

# --- Configuration ---
sns.set_style("whitegrid")
FEATURES_DIR = Path("../data/features")
REPORTS_PLOTS_DIR = Path("../reports/plots")
REPORTS_PLOTS_DIR.mkdir(parents=True, exist_ok=True)

# Assumed sample rate (16000 Hz) and hop length (256 samples) for the time axis;
# these must match the values used in src/feature_extraction.py
SR = 16000
HOP_LENGTH = 256
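
The two constants above determine how a frame index is converted into a time stamp on every plot in this notebook. As a minimal sketch, assuming the same `SR` and `HOP_LENGTH` values (the helper name `frame_to_seconds` is ours and is not part of `librosa`):

```python
# Sketch: how a frame index maps to a time stamp on the plots in this notebook.
SR = 16000        # assumed sample rate (Hz), same value as the configuration cell
HOP_LENGTH = 256  # assumed hop length (samples), same value as the configuration cell


def frame_to_seconds(frame_index: int, sr: int = SR, hop_length: int = HOP_LENGTH) -> float:
    """Return the start time (in seconds) of the given frame."""
    return frame_index * hop_length / sr


frame_duration = frame_to_seconds(1)   # seconds covered by one hop
frames_per_second = SR / HOP_LENGTH    # number of frames per second of audio
print(frame_duration, frames_per_second)
```

If the extraction script used a different hop length or sample rate, the time axes below would be stretched or compressed by the same ratio.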

## 1. Load Sample Feature File

Let's find the first available `.npy` file in the features directory and load it. We assume each file stores a single Python dictionary that holds all feature types.

In [None]:
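def load_sample_features(base_dir: Path):
    """Find and load the first .npy file (in sorted path order) under base_dir."""
    # rglob yields paths in arbitrary order, so sort for a reproducible sample
    sample_path = next(iter(sorted(base_dir.rglob("*.npy"))), None)
    if sample_path is None:
        print(f"Error: No feature files found in {base_dir}")
        return None, None

    print(f"Loading sample: {sample_path.name}")
    # allow_pickle=True is needed because the file stores a pickled Python dictionary;
    # only load files from a trusted source.
    # .item() unwraps the 0-d object array returned by np.load into the dictionary.
    features = np.load(sample_path, allow_pickle=True).item()
    return features, sample_path.name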

features, sample_name = load_sample_features(FEATURES_DIR)

if features:
    print("\nLoaded feature keys:")
    print(list(features.keys()))
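
To make the `allow_pickle=True` and `.item()` calls less mysterious, here is a small round trip under stated assumptions. The dictionary below is a hypothetical stand-in: the key names mirror the ones this notebook looks up, but the shapes (80 mel bands, 13 coefficients, 10 frames) are made up for illustration and are not taken from `src/feature_extraction.py`.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical feature dictionary; shapes are illustrative only.
example = {
    "mel_spectrogram": np.ones((80, 10)),
    "pitch_contour": np.zeros(10),
    "mfcc": np.zeros((13, 10)),
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.npy"
    np.save(path, example, allow_pickle=True)   # the dict is wrapped in a 0-d object array
    raw = np.load(path, allow_pickle=True)      # still a 0-d object array
    restored = raw.item()                       # unwrap back into a plain dict

print(raw.shape, raw.dtype)
print(sorted(restored))
```

Because loading pickled data can execute arbitrary code, this pattern should only be used on feature files we generated ourselves.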

## 2. Visualize Mel Spectrogram

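The mel spectrogram is the primary input to the model's encoder. We expect to see clear speech formants and harmonic structure in voiced regions. The dB conversion below assumes the stored values are on a power scale, not already in dB.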

In [None]:
if features and 'mel_spectrogram' in features:
    mel_spec = features['mel_spectrogram']
    print(f"Mel Spectrogram shape: {mel_spec.shape}") # (n_mels, n_frames)
    
    plt.figure(figsize=(15, 4))
    # Convert to dB for display (assumes mel_spec is a power spectrogram)
    S_db = librosa.power_to_db(mel_spec, ref=np.max)
    librosa.display.specshow(S_db, sr=SR, hop_length=HOP_LENGTH, x_axis='time', y_axis='mel')
    plt.colorbar(format='%+2.0f dB')
    plt.title(f"Mel Spectrogram - {sample_name}")
    plt.xlabel("Time (s)")
    plt.ylabel("Frequency (Hz)")
    plt.tight_layout()
    plt.savefig(REPORTS_PLOTS_DIR / "02_sample_melspectrogram.png")
    plt.show()
else:
    print("Key 'mel_spectrogram' not found in feature file.")
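
`librosa.power_to_db` only makes sense if the stored spectrogram is on a power scale. Since we have not confirmed how `src/feature_extraction.py` stores it, a rough heuristic check can help: a power spectrogram has no negative values, while one already normalized to dB with its peak at 0 does. The function name `looks_like_power` and the synthetic arrays are ours, for illustration only.

```python
import numpy as np


def looks_like_power(mel: np.ndarray) -> bool:
    """Heuristic: a power (or magnitude) spectrogram is finite and non-negative."""
    return bool(np.all(np.isfinite(mel)) and np.min(mel) >= 0)


# Synthetic stand-ins, not real features.
power_mel = np.abs(np.random.default_rng(0).normal(size=(80, 20))) ** 2
db_mel = 10.0 * np.log10(np.maximum(power_mel, 1e-10))
db_mel -= db_mel.max()   # peak at 0 dB, everything else negative

print(looks_like_power(power_mel))
print(looks_like_power(db_mel))
```

This is only a heuristic: a dB spectrogram that happens to be all positive would pass the check, so the extraction script remains the authoritative source.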

## 3. Visualize Pitch Contour (F0)

This plots the speaker's fundamental frequency (pitch) over time. We expect values of 0 during unvoiced segments (silence and unvoiced consonants such as 's' and 'f'), assuming the extractor encodes unvoiced frames as 0 rather than NaN.

In [None]:
if features and 'pitch_contour' in features:
    f0 = features['pitch_contour']
    print(f"Pitch Contour shape: {f0.shape}") # (n_frames,)

    plt.figure(figsize=(15, 4))
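    # Time stamp (in seconds) of each frame
    times = librosa.times_like(f0, sr=SR, hop_length=HOP_LENGTH)
    
    # Mask unvoiced frames (0) with NaN so they are not plotted. np.where returns
    # a new float array, so the loaded features are not modified in place.
    f0_voiced = np.where(f0 == 0, np.nan, f0)
    
    plt.plot(times, f0_voiced, 'o', markersize=2, label='F0 (Hz)')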
    plt.title(f"Pitch Contour (F0) - {sample_name}")
    plt.xlabel("Time (s)")
    plt.ylabel("Frequency (Hz)")
    plt.legend()
    plt.tight_layout()
    plt.savefig(REPORTS_PLOTS_DIR / "02_sample_pitch_contour.png")
    plt.show()
else:
    print("Key 'pitch_contour' not found in feature file.")
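
Beyond the plot, a few summary numbers make it easier to spot a broken pitch track (for example, one that is entirely unvoiced). The sketch below treats both 0 and NaN as unvoiced, since we have not confirmed which convention the extractor uses. The helper `voiced_stats` and the example array are hypothetical.

```python
import numpy as np


def voiced_stats(f0: np.ndarray) -> dict:
    """Summarize an F0 track, treating 0 and NaN as unvoiced frames."""
    f0 = np.asarray(f0, dtype=float)
    voiced = np.isfinite(f0) & (f0 > 0)
    stats = {"voiced_fraction": float(voiced.mean())}
    if voiced.any():
        stats["median_hz"] = float(np.median(f0[voiced]))
        stats["min_hz"] = float(f0[voiced].min())
        stats["max_hz"] = float(f0[voiced].max())
    return stats


# Made-up contour: four voiced frames out of eight.
example_f0 = np.array([0.0, 0.0, 120.0, 125.0, 130.0, 0.0, np.nan, 210.0])
print(voiced_stats(example_f0))
```

On real data this would be called as `voiced_stats(features['pitch_contour'])`.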

## 4. Visualize MFCCs

MFCCs (Mel-Frequency Cepstral Coefficients) are another common feature, often used for speaker and accent identification. They capture the timbral quality of the voice.

In [None]:
if features and 'mfcc' in features:
    mfccs = features['mfcc']
    print(f"MFCC shape: {mfccs.shape}") # (n_mfcc, n_frames)
    
    plt.figure(figsize=(15, 4))
    librosa.display.specshow(mfccs, sr=SR, hop_length=HOP_LENGTH, x_axis='time')
    plt.colorbar(label='Coefficient Value')
    plt.title(f"MFCCs - {sample_name}")
    plt.xlabel("Time (s)")
    plt.ylabel("MFCC Index")
    plt.tight_layout()
    plt.savefig(REPORTS_PLOTS_DIR / "02_sample_mfccs.png")
    plt.show()
else:
    print("Key 'mfcc' not found in feature file.")
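
For intuition about what the MFCC plot shows: MFCCs are commonly obtained by applying a type-II discrete cosine transform along the mel axis of a log-mel spectrogram and keeping the first few coefficients. The NumPy sketch below illustrates that standard construction on random data. Whether `src/feature_extraction.py` computes its MFCCs this way is an assumption, and `n_mfcc=13` and the 80 mel bands are illustrative values.

```python
import numpy as np


def dct_ii_matrix(n_out: int, n_in: int) -> np.ndarray:
    """Orthonormal DCT-II basis; each row is a cosine basis function over n_in bands."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    basis = np.sqrt(2.0 / n_in) * np.cos(np.pi * (n + 0.5) * k / n_in)
    basis[0] /= np.sqrt(2.0)   # rescale the constant row so the basis is orthonormal
    return basis


def mfcc_from_log_mel(log_mel: np.ndarray, n_mfcc: int = 13) -> np.ndarray:
    """Apply the DCT along the mel axis and keep the first n_mfcc coefficients."""
    return dct_ii_matrix(n_mfcc, log_mel.shape[0]) @ log_mel


# Random stand-in for a log-mel spectrogram of shape (n_mels, n_frames).
log_mel = np.random.default_rng(0).normal(size=(80, 50))
mfccs = mfcc_from_log_mel(log_mel)
print(mfccs.shape)
```

The low-order coefficients describe the broad spectral envelope, which is why MFCCs are associated with timbre rather than pitch.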

## 5. Initial Findings

The feature files in `data/features/` appear to be structured correctly as Python dictionaries.

* **Mel Spectrogram:** The visualization shows clear harmonic structures, indicating the feature is valid.
* **Pitch Contour:** The F0 plot correctly shows voiced segments and unvoiced gaps (where F0=0).
* **MFCCs:** The cepstral coefficients are loaded correctly.

Based on the shapes printed above, all features appear to be aligned in time (i.e., they have the same number of frames) and are ready for the training pipeline.
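
The alignment claim can be checked programmatically instead of by eye. This is a minimal sketch, assuming the time axis is the last axis of every feature array, as the shape comments in the cells above suggest. The helper names and the example dictionary are hypothetical.

```python
import numpy as np

FEATURE_KEYS = ("mel_spectrogram", "pitch_contour", "mfcc")


def frame_counts(features: dict, keys=FEATURE_KEYS) -> dict:
    """Return the number of frames per feature, taking the last axis as time."""
    return {k: int(np.asarray(features[k]).shape[-1]) for k in keys if k in features}


def frames_aligned(features: dict, keys=FEATURE_KEYS) -> bool:
    """True when all present features share the same number of frames."""
    return len(set(frame_counts(features, keys).values())) <= 1


# Made-up shapes for illustration; replace with the loaded `features` dict.
example = {
    "mel_spectrogram": np.zeros((80, 120)),
    "pitch_contour": np.zeros(120),
    "mfcc": np.zeros((13, 120)),
}
print(frame_counts(example))
print(frames_aligned(example))
```

Running `frames_aligned(features)` on the loaded sample, or over every file in `data/features/`, would turn the observation above into a verified property.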