# Train "Hey Nap" Wake Word Model

This notebook trains a custom OpenWakeWord model for "Hey Nap" and exports it as TFLite for ESP32 deployment.

**Wake Word:** `hey nap`

**Output:** `hey_nap.tflite` (ready for ESP32-S3 Korvo1)

## Step 1: Install Dependencies

In [None]:
# Install required packages
!pip install -q openwakeword torch torchvision torchaudio numpy scipy scikit-learn pyyaml tqdm
!pip install -q onnx onnx-tf tensorflow

## Step 2: Configure Wake Word

**Edit the wake word below:**

In [None]:
# Configure your wake word here
WAKE_WORD = "hey nap"  # Change this to your desired wake word

print(f"Training model for wake word: '{WAKE_WORD}'")
print(f"Model name will be: {WAKE_WORD.replace(' ', '_')}")
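
The model name above comes from a plain space-to-underscore replacement, so a wake word typed with capitals or punctuation would carry those characters into file and directory names. A minimal sketch of a stricter normalization follows; `wake_word_to_model_name` is a hypothetical helper, not part of OpenWakeWord.

```python
import re

def wake_word_to_model_name(wake_word: str) -> str:
    """Lowercase, drop anything but letters, digits and spaces, then join words with underscores."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", wake_word.lower())
    return "_".join(cleaned.split())

print(wake_word_to_model_name("hey nap"))       # hey_nap
print(wake_word_to_model_name("  Hey, Nap! "))  # hey_nap
```

For the default `"hey nap"` this gives the same result as `WAKE_WORD.replace(' ', '_')`.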

## Step 3: Import Libraries

In [None]:
import os
import sys
import torch
import numpy as np
from pathlib import Path
import openwakeword
from openwakeword.utils import download_models

# Check GPU availability
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"Using device: {device}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

# Download OpenWakeWord models (for feature extraction)
print("\nDownloading OpenWakeWord feature models...")
download_models(['hey_jarvis'], inference_framework='onnx')  # Just need one for features

## Step 4: Generate Synthetic Training Data

The full pipeline uses TTS to generate ~100,000 training samples of your wake word. The simplified cells below generate only 10 samples with gTTS.

In [None]:
# Install TTS packages: Piper for the full pipeline, gTTS for the simplified cell below
!pip install -q piper-tts gTTS

# Create directories
MODEL_NAME = WAKE_WORD.replace(' ', '_')
OUTPUT_DIR = f"./models/{MODEL_NAME}"
os.makedirs(f"{OUTPUT_DIR}/positive_train", exist_ok=True)
os.makedirs(f"{OUTPUT_DIR}/positive_test", exist_ok=True)
os.makedirs(f"{OUTPUT_DIR}/negative_train", exist_ok=True)
os.makedirs(f"{OUTPUT_DIR}/negative_test", exist_ok=True)

print(f"Output directory: {OUTPUT_DIR}")

In [None]:
# Generate positive training samples
# Note: This is a simplified version. Full training uses more sophisticated TTS
# For production, use the full OpenWakeWord training pipeline

print(f"Generating positive samples for '{WAKE_WORD}'...")
print("\n⚠️  Note: Full training requires the complete OpenWakeWord training pipeline.")
print("This notebook provides a simplified approach.")
print("\nFor best results, use the official OpenWakeWord Colab notebook:")
print("https://colab.research.google.com/drive/1q1oe2zOyZp7UsB3jJiQ1IFn8z5YfjwEb")

# Workaround: generate a handful of samples with gTTS instead of the full pipeline
try:
    from gtts import gTTS
    import subprocess

    # Generate a few samples using gTTS (limited but works).
    # The text and voice are the same each time, so the clips are essentially identical.
    print("\nGenerating sample audio files...")
    for i in range(10):
        mp3_path = f"{OUTPUT_DIR}/positive_train/sample_{i:04d}.mp3"
        wav_path = f"{OUTPUT_DIR}/positive_train/sample_{i:04d}.wav"
        gTTS(text=WAKE_WORD, lang='en', slow=False).save(mp3_path)
        # Convert to 16 kHz mono 16-bit WAV with ffmpeg. subprocess is used because
        # IPython's `!` expansion does not support format specs such as {i:04d}.
        subprocess.run(
            ["ffmpeg", "-y", "-i", mp3_path, "-ar", "16000", "-ac", "1",
             "-sample_fmt", "s16", wav_path],
            check=True, capture_output=True,
        )
        os.remove(mp3_path)
    print("✅ Generated sample audio files")
except Exception as e:
    print(f"⚠️  Could not generate samples: {e}")
    print("\n💡 For full training, use the official OpenWakeWord Colab notebook")
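
The conversion step above targets 16 kHz, mono, 16-bit PCM. A quick way to confirm that the generated clips actually have that format is to read their headers with the standard-library `wave` module. The sketch below is illustrative only; `is_training_ready` and `find_bad_clips` are hypothetical helper names.

```python
import wave
from pathlib import Path

def is_training_ready(wav_path, sample_rate=16000):
    """Return True if the file is mono, 16-bit PCM at the expected sample rate."""
    with wave.open(str(wav_path), "rb") as wav:
        return (
            wav.getnchannels() == 1
            and wav.getsampwidth() == 2  # 2 bytes per sample = 16-bit
            and wav.getframerate() == sample_rate
        )

def find_bad_clips(directory, sample_rate=16000):
    """List the .wav files in `directory` that do not match the expected format."""
    return sorted(
        p for p in Path(directory).glob("*.wav")
        if not is_training_ready(p, sample_rate)
    )
```

In this notebook it could be called as `find_bad_clips(f"{OUTPUT_DIR}/positive_train")`; an empty list means every clip matches.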

## Alternative: Use Official OpenWakeWord Training Notebook

The full training pipeline is complex. For best results, use the official notebook:

**🔗 https://colab.research.google.com/drive/1q1oe2zOyZp7UsB3jJiQ1IFn8z5YfjwEb**

In that notebook:
1. Set wake word to: `"hey nap"`
2. Run all cells
3. Download the TFLite model

This notebook provides a simplified setup, but the official one has:
- Full TTS pipeline (Piper)
- Proper data augmentation
- Complete training pipeline
- Automatic TFLite export
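
To give a sense of what data augmentation means here, one common technique is mixing background noise into each clean clip at a chosen signal-to-noise ratio. The sketch below is a generic NumPy illustration, not the official notebook's implementation; `mix_at_snr` is a hypothetical helper.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mix has the requested SNR in dB."""
    speech = np.asarray(speech, dtype=np.float32)
    # Repeat or trim the noise so it has the same length as the speech clip.
    noise = np.resize(np.asarray(noise, dtype=np.float32), speech.shape)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        return speech.copy()  # nothing to scale against
    # Solve speech_power / (scale**2 * noise_power) = 10**(snr_db / 10) for scale.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

Applying this with several noise recordings and SNR values turns one clean clip into many distinct training examples.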

## Instructions for Official Notebook

1. **Open the notebook:** https://colab.research.google.com/drive/1q1oe2zOyZp7UsB3jJiQ1IFn8z5YfjwEb

2. **Find the wake word configuration cell** and set:
   ```python
   wake_word = "hey nap"  # or ["hey nap"]
   ```

3. **Run all cells** (Runtime → Run all)
   - This will take ~1 hour
   - Most of it is automated

4. **Download the model:**
   - Look for the download link at the end
   - Download `hey_nap.tflite`

5. **Test locally:**
   ```bash
   python3 test_hey_nap_local.py
   ```

6. **Deploy to ESP32:**
   - Copy to `components/openwakeword/models/hey_nap.tflite`
   - Update ESP32 code to load the model
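
The copy in step 6 can be scripted so that a bad download (for example, an HTML error page saved as `hey_nap.tflite`) is caught before it reaches the firmware tree. This is a minimal sketch: `install_model` and `project_root` are hypothetical names, and the check relies on TFLite flatbuffers carrying the identifier `TFL3` at byte offset 4.

```python
import shutil
from pathlib import Path

def install_model(tflite_path, project_root,
                  models_subdir="components/openwakeword/models"):
    """Copy a .tflite model into the firmware project and return the destination path."""
    src = Path(tflite_path)
    if not src.is_file():
        raise FileNotFoundError(f"Model file not found: {src}")
    with src.open("rb") as f:
        header = f.read(8)
    # TFLite flatbuffers store the file identifier "TFL3" in bytes 4..7.
    if header[4:8] != b"TFL3":
        raise ValueError(f"{src} does not look like a TFLite model")
    dest_dir = Path(project_root) / models_subdir
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.copy2(src, dest)
    return dest
```

For example, `install_model("hey_nap.tflite", ".")` run from the firmware project root copies the file to `components/openwakeword/models/hey_nap.tflite`.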