# Introduction

This notebook demonstrates how to train custom openWakeWord models using pre-defined datasets and an automated process for dataset generation and training. While not guaranteed to always produce the best-performing model, the methods shown in this notebook often produce baseline models with relatively strong performance.

Manual data preparation and model training (e.g., see the [training models](training_models.ipynb) notebook) remains an option for when full control over the model development process is needed.

At a high level, the automatic training process takes advantage of several techniques to try to produce a good model, including:

- Early-stopping and checkpoint averaging (similar to [stochastic weight averaging](https://arxiv.org/abs/1803.05407)) to search for the best models found during training, according to the validation data
- Variable learning rates with cosine decay and multiple cycles
- Adaptive batch construction to focus on only high-loss examples when the model begins to converge, combined with gradient accumulation to ensure that batch sizes are still large enough for stable training
- Cyclical weight schedules for negative examples to help the model reduce false-positive rates

See the contents of the `train.py` file for more details.
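To make the "cosine decay and multiple cycles" idea more concrete, here is a minimal sketch of such a learning-rate schedule. It is not taken from `train.py`; the function name `cosine_cycle_lr` and the values for `n_cycles`, `max_lr`, and `min_lr` are assumptions chosen only for illustration.

```python
import math


def cosine_cycle_lr(step, total_steps, n_cycles=3, max_lr=1e-3, min_lr=1e-5):
    """Learning rate at `step` for a cosine schedule that restarts every cycle.

    Hypothetical helper for illustration; the real schedule lives in train.py.
    """
    cycle_len = total_steps / n_cycles
    # Position inside the current cycle: 0.0 at the start, just under 1.0 at the end
    pos = (step % cycle_len) / cycle_len
    # Cosine decay from max_lr (pos = 0) towards min_lr (pos -> 1)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * pos))


# Three cycles over 300 steps: the rate jumps back to max_lr at steps 0, 100, 200
schedule = [cosine_cycle_lr(s, total_steps=300) for s in range(300)]
```

Each restart lets the optimizer take larger steps again after it has settled, which is one common motivation for combining cyclical schedules with checkpoint averaging.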

# Environment Setup

To begin, we'll need to install the requirements for training custom models: in particular, a relatively recent version of PyTorch and a custom fork of the [piper-sample-generator](https://github.com/dscripka/piper-sample-generator) library for generating synthetic examples for the custom model.

**Important Note!** Currently, automated model training is only supported on Linux systems due to the requirements of the text-to-speech library used for synthetic sample generation (Piper). It may be possible to use Piper on Windows/Mac systems, but that has not (yet) been tested.

In [1]:
## Environment setup
!git clone https://github.com/rhasspy/piper-sample-generator
!wget -O piper-sample-generator/models/en_US-libritts_r-medium.pt 'https://github.com/rhasspy/piper-sample-generator/releases/download/v2.0.0/en_US-libritts_r-medium.pt'
!pip install piper-phonemize==1.1.0
!pip install webrtcvad==2.0.10

# install openwakeword (full installation to support training)
!git clone https://github.com/dscripka/openwakeword
!mkdir models
!wget -O models/pt_PT-rita.onnx 'https://github.com/Katilho/piper-sample-generator/releases/download/v0.1.0/pt_PT-rita.onnx'
!wget -O models/pt_PT-rita.onnx.json 'https://github.com/Katilho/piper-sample-generator/releases/download/v0.1.0/pt_PT-rita.onnx.json'
!wget -O models/voices.json 'https://huggingface.co/rhasspy/piper-voices/resolve/main/voices.json'
!pip install -e ./openwakeword
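# Note: `!cd` runs in a subshell and does not change the notebook's working directory,
# so the remaining cells keep using paths relative to the current directory.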

# Download required models (workaround for Colab)
import os

!mkdir -p ./openwakeword/openwakeword/resources/models
!wget -nc https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/embedding_model.onnx -O ./openwakeword/openwakeword/resources/models/embedding_model.onnx
!wget -nc https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/embedding_model.tflite -O ./openwakeword/openwakeword/resources/models/embedding_model.tflite
!wget -nc https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/melspectrogram.onnx -O ./openwakeword/openwakeword/resources/models/melspectrogram.onnx
!wget -nc https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/melspectrogram.tflite -O ./openwakeword/openwakeword/resources/models/melspectrogram.tflite

# The augmentation step apparently also looks for these models in a different directory, so download them there too.
!mkdir -p ./src/openwakeword/openwakeword/resources/models
!wget -nc https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/embedding_model.onnx \
  -O ./src/openwakeword/openwakeword/resources/models/embedding_model.onnx
!wget -nc https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/embedding_model.tflite \
  -O ./src/openwakeword/openwakeword/resources/models/embedding_model.tflite
!wget -nc https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/melspectrogram.onnx \
  -O ./src/openwakeword/openwakeword/resources/models/melspectrogram.onnx
!wget -nc https://github.com/dscripka/openWakeWord/releases/download/v0.5.1/melspectrogram.tflite \
  -O ./src/openwakeword/openwakeword/resources/models/melspectrogram.tflite

Cloning into 'piper-sample-generator'...
remote: Enumerating objects: 124, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (21/21), done.
remote: Total 124 (delta 42), reused 34 (delta 34), pack-reused 69 (from 1)
Receiving objects: 100% (124/124), 1.03 MiB | 2.84 MiB/s, done.
Resolving deltas: 100% (52/52), done.
--2025-08-04 14:01:28--  https://github.com/rhasspy/piper-sample-generator/releases/download/v2.0.0/en_US-libritts_r-medium.pt
Resolving github.com (github.com)... 140.82.121.4
Connecting to github.com (github.com)|140.82.121.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://release-assets.githubusercontent.com/github-production-release-asset/642029941/73f4af3c-7cf8-4547-a7b9-3bd29e7f3c33?sp=r&sv=2018-11-09&sr=b&spr=https&se=2025-08-04T15%3A00%3A52Z&rscd=attachment%3B+filename%3Den_US-libritts_r-medium.pt&rsct=application%2Foctet-stream&skoid=96c2d410-5711-43a1-aedd-ab1947aa7ab0&sktid=398

In [2]:
# # install other dependencies
# !pip install mutagen==1.47.0
# !pip install torchinfo==1.8.0
# !pip install torchmetrics==1.2.0
# !pip install speechbrain==0.5.14
# !pip install audiomentations==0.33.0
# !pip install torch-audiomentations==0.11.0
# !pip install acoustics==0.2.6
# # !pip uninstall tensorflow -y
# # !pip install tensorflow-cpu==2.8.1
# # !pip install protobuf==3.20.3
# # !pip install tensorflow_probability==0.16.0
# # !pip install onnx_tf==1.10.0
# # # My old attempts to fix the conversion to .onnx, maybe unnecessary now
# # !pip install "tensorflow==2.15.0"
# # !pip install "tensorflow-addons==0.23.0"  # match TFA to TF version (check compatibility matrix if unsure)
# # !pip install onnx-tf
# # !pip install tensorflow-probability
# # !pip install protobuf
# # !pip install tf-keras tensorflow-probability[tf]
# # #
# !pip install onnx_tf==1.10.0
# !pip install onnx2tf==1.28.2
# !pip install onnx==1.18.0
# !pip install onnx_graphsurgeon==0.5.8
# !pip install sng4onnx==1.0.4
# !pip install pronouncing==0.2.0
# !pip install datasets==2.14.6
# !pip install deep-phonemizer==0.0.19
# # My additions
# !pip install onnxruntime-gpu==1.20.0 # 1.19 for cuda 11.x; 1.20 for cuda 12.x (the actual version of cuda is obtained from nvcc --version and NOT nvidia-smi)
# !pip install piper-tts==1.2.0
# !pip install webrtcvad==2.0.10


# !pip install datasets==2.14.6
# !pip install pyarrow==20.0.0

!wget https://raw.githubusercontent.com/pedromartinsdtx/oww-training/refs/heads/main/requirements.txt
!pip install -r requirements.txt

!pip install onnxruntime-gpu==1.19.2
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

--2025-08-04 14:02:12--  https://raw.githubusercontent.com/pedromartinsdtx/oww-training/refs/heads/main/requirements.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3634 (3.5K) [text/plain]
Saving to: ‘requirements.txt’


2025-08-04 14:02:13 (35.7 MB/s) - ‘requirements.txt’ saved [3634/3634]

Collecting protobuf==6.31.1 (from -r requirements.txt (line 124))
  Using cached protobuf-6.31.1-cp39-abi3-manylinux2014_x86_64.whl.metadata (593 bytes)
Collecting tensorboard==2.20.0 (from -r requirements.txt (line 146))
  Using cached tensorboard-2.20.0-py3-none-any.whl.metadata (1.8 kB)
Using cached protobuf-6.31.1-cp39-abi3-manylinux2014_x86_64.whl (321 kB)
Using cached tensorboard-2.20.0-py3-none-any.whl (5.5 MB)
Installing collected packages: protobuf, tensorb

In [3]:
# Imports

import os
import numpy as np
import torch
import sys
from pathlib import Path
import uuid
import yaml
import datasets
import scipy
from tqdm import tqdm


# Download Data

When training new openWakeWord models using the automated procedure, five specific types of data are required:

1) Synthetic examples of the target word/phrase generated with text-to-speech models

2) Synthetic examples of adversarial words/phrases generated with text-to-speech models

3) Room impulse responses and noise/background audio data to augment the synthetic examples and make them more realistic

4) Generic "negative" audio data that is very unlikely to contain examples of the target word/phrase in the context where the model should detect it. This data can be the original audio data, or precomputed openWakeWord features ready for model training.

5) Validation data to use for early-stopping when training the model.

For the purposes of this notebook, all five of these sources will either be generated manually or can be obtained from HuggingFace thanks to their excellent `datasets` library and extremely generous hosting policy. Also note that while only a portion of some datasets is downloaded here, for the best possible performance it is recommended to download the entire dataset and keep a local copy for future training runs.

In [4]:
import os
import time
import numpy as np
import scipy.io.wavfile
from tqdm import tqdm
from datasets import load_dataset
from huggingface_hub.utils import HfHubHTTPError

output_dir = "./mit_rirs"
if not os.path.exists(output_dir):
    os.mkdir(output_dir)

# Retry logic to handle rate limits
max_retries = 5
backoff = 5  # initial backoff in seconds

for attempt in range(max_retries):
    try:
        rir_dataset = load_dataset(
            "davidscripka/MIT_environmental_impulse_responses",
            split="train",
            streaming=True,
        )
        break  # exit retry loop if successful
    except HfHubHTTPError as e:
        if e.response.status_code == 429 and attempt < max_retries - 1:
            print(f"Rate limited (429). Retrying in {backoff} seconds...")
            time.sleep(backoff)
            backoff *= 2  # exponential backoff
        else:
            raise

# Save clips to 16-bit PCM wav files
for row in tqdm(rir_dataset):
    name = row["audio"]["path"].split("/")[-1]
    audio = (row["audio"]["array"] * 32767).astype(np.int16)
    scipy.io.wavfile.write(os.path.join(output_dir, name), 16000, audio)

Resolving data files:   0%|          | 0/270 [00:00<?, ?it/s]

270it [01:28,  3.04it/s]
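The cell above converts floating-point audio to 16-bit PCM by scaling with 32767 and casting. As a hedged aside, samples outside the range [-1.0, 1.0] can overflow on that cast, so a small helper that clips first may be safer. The name `float_to_pcm16` is hypothetical and not part of the notebook or of openWakeWord.

```python
import numpy as np


def float_to_pcm16(audio):
    """Convert float audio in [-1.0, 1.0] to 16-bit PCM, clipping out-of-range samples."""
    audio = np.asarray(audio, dtype=np.float64)
    clipped = np.clip(audio, -1.0, 1.0)  # prevent overflow when casting to int16
    return (clipped * 32767).astype(np.int16)


# The last sample (1.5) is out of range and is clipped to full scale
pcm = float_to_pcm16([0.0, 0.5, 1.0, -1.0, 1.5])
```

The same helper could replace the inline `(row["audio"]["array"] * 32767).astype(np.int16)` expressions used in the download cells, if clipping is wanted.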


In [5]:
## Download noise and background audio

# Audioset Dataset (https://research.google.com/audioset/dataset/index.html)
# Download one part of the audioset .tar files, extract, and convert to 16khz
# For full-scale training, it's recommended to download the entire dataset from
# https://huggingface.co/datasets/agkphysics/AudioSet, and
# even potentially combine it with other background noise datasets (e.g., FSD50k, Freesound, etc.)

if not os.path.exists("audioset"):
    os.mkdir("audioset")

fname = "bal_train09.tar"
out_dir = f"audioset/{fname}"
link = "https://huggingface.co/datasets/agkphysics/AudioSet/resolve/main/" + fname
!wget -O {out_dir} {link}
!cd audioset && tar -xvf bal_train09.tar

output_dir = "./audioset_16k"
if not os.path.exists(output_dir):
    os.mkdir(output_dir)

# Convert audioset files to 16khz sample rate
audioset_dataset = datasets.Dataset.from_dict({"audio": [str(i) for i in Path("audioset/audio").glob("**/*.flac")]})
audioset_dataset = audioset_dataset.cast_column("audio", datasets.Audio(sampling_rate=16000))
for row in tqdm(audioset_dataset):
    name = row['audio']['path'].split('/')[-1].replace(".flac", ".wav")
    scipy.io.wavfile.write(os.path.join(output_dir, name), 16000, (row['audio']['array']*32767).astype(np.int16))

# Free Music Archive dataset (https://github.com/mdeff/fma)
output_dir = "./fma"
if not os.path.exists(output_dir):
    os.mkdir(output_dir)
fma_dataset = datasets.load_dataset("rudraml/fma", name="small", split="train", streaming=True)
fma_dataset = iter(fma_dataset.cast_column("audio", datasets.Audio(sampling_rate=16000)))

n_hours = 3  # use only 3 hours of clips for this example notebook; increase for full-scale training
n_clips = n_hours * 3600 // 30  # this works because the FMA dataset is all 30 second clips
for _ in tqdm(range(n_clips)):
    row = next(fma_dataset)
    name = row['audio']['path'].split('/')[-1].replace(".mp3", ".wav")
    scipy.io.wavfile.write(os.path.join(output_dir, name), 16000, (row['audio']['array']*32767).astype(np.int16))


--2025-08-04 14:04:01--  https://huggingface.co/datasets/agkphysics/AudioSet/resolve/main/bal_train09.tar
Resolving huggingface.co (huggingface.co)... 3.164.240.43, 3.164.240.18, 3.164.240.38, ...
Connecting to huggingface.co (huggingface.co)|3.164.240.43|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2025-08-04 14:04:02 ERROR 404: Not Found.

tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors


0it [00:00, ?it/s]
100%|█████████▉| 359/360 [02:39<00:00,  2.25it/s]


In [6]:
# Download pre-computed openWakeWord features for training and validation

# training set (~2,000 hours from the ACAV100M Dataset)
# See https://huggingface.co/datasets/davidscripka/openwakeword_features for more information
!wget https://huggingface.co/datasets/davidscripka/openwakeword_features/resolve/main/openwakeword_features_ACAV100M_2000_hrs_16bit.npy

# validation set for false positive rate estimation (~11 hours)
!wget https://huggingface.co/datasets/davidscripka/openwakeword_features/resolve/main/validation_set_features.npy

--2025-08-04 14:07:22--  https://huggingface.co/datasets/davidscripka/openwakeword_features/resolve/main/openwakeword_features_ACAV100M_2000_hrs_16bit.npy
Resolving huggingface.co (huggingface.co)... 3.164.240.38, 3.164.240.65, 3.164.240.43, ...
Connecting to huggingface.co (huggingface.co)|3.164.240.38|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cas-bridge.xethub.hf.co/xet-bridge-us/64f3a0b6918ffcc15af6923c/7e1cade4c3fda6a5081158383c8d43c4a3e1e42555150b596b373efddf9b5194?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Credential=cas%2F20250804%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250804T140722Z&X-Amz-Expires=3600&X-Amz-Signature=fce2159baec8b284b3eedcf8b5c867642cbc08c56ca20fafc3ede536f034f262&X-Amz-SignedHeaders=host&X-Xet-Cas-Uid=public&response-content-disposition=inline%3B+filename*%3DUTF-8%27%27openwakeword_features_ACAV100M_2000_hrs_16bit.npy%3B+filename%3D%22openwakeword_features_ACAV100M_2000_hrs_16
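Because the pre-computed feature files can be large, it may help to inspect them without loading everything into memory. The sketch below uses NumPy's memory-mapping for that. The helper name `describe_features` is hypothetical, and the small demo array (including its shape) is only a stand-in, not a statement about the real dataset's layout.

```python
import os
import tempfile

import numpy as np


def describe_features(path):
    """Open a .npy file lazily and report its shape, dtype, and on-disk size."""
    features = np.load(path, mmap_mode="r")  # memory-mapped: the data stays on disk
    return {
        "shape": features.shape,
        "dtype": str(features.dtype),
        "size_mb": os.path.getsize(path) / 1e6,
    }


# Demonstration on a small stand-in file (hypothetical shape, not the real dataset)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_features.npy")
    np.save(demo_path, np.zeros((10, 16, 96), dtype=np.float16))
    info = describe_features(demo_path)
```

Once the downloads finish, calling `describe_features("validation_set_features.npy")` should give a quick sanity check that the file is complete and readable.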

# Define Training Configuration

For automated model training, openWakeWord uses a specially designed training script and a [YAML](https://yaml.org/) configuration file that defines all of the information required for training a new wake word/phrase detection model.

It is strongly recommended that you review the example config file `openwakeword/examples/custom_model.yml`, as each value is fully documented there.

For the purposes of this notebook, we'll read in the YAML file to modify certain configuration parameters before saving a new YAML file for training our example model. Specifically:

- We'll train a detection model for the phrase "Hey_Cledeess"
- We'll generate 35,000 positive and negative examples (around 100,000 is suggested for the best results)
- We'll generate 2,000 validation positive and negative examples for early stopping
- The model will be trained for 30,000 steps (larger datasets will benefit from longer training)
- We'll set the maximum weight on negative examples to 3,000
- We'll keep the target metrics at the default values from the config file.

On the topic of target metrics, there are *no* specific guidelines about what these metrics should be in practice, and you will need to conduct testing in your target deployment environment to establish good thresholds. However, in very limited testing, the default values in the config file (accuracy >= 0.7, recall >= 0.5, false-positive rate <= 0.2 per hour) seem to produce models with reasonable performance.
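As a rough illustration of what these three targets measure, the sketch below computes accuracy, recall, and false positives per hour from lists of model scores. It assumes each above-threshold score on negative audio counts as one false activation, which may differ from how `train.py` counts them; the function name `summarize_metrics`, the threshold of 0.5, and the example scores are all hypothetical.

```python
import numpy as np


def summarize_metrics(pos_scores, neg_scores, neg_hours, threshold=0.5):
    """Accuracy, recall, and false positives per hour at a given score threshold."""
    pos_scores = np.asarray(pos_scores, dtype=float)
    neg_scores = np.asarray(neg_scores, dtype=float)
    tp = int((pos_scores >= threshold).sum())  # positives correctly detected
    fp = int((neg_scores >= threshold).sum())  # negatives wrongly detected
    tn = len(neg_scores) - fp
    return {
        "accuracy": (tp + tn) / (len(pos_scores) + len(neg_scores)),
        "recall": tp / len(pos_scores),
        "false_positives_per_hour": fp / neg_hours,
    }


# Made-up scores: 4 positive clips and 6 negative clips covering 2 hours of audio
metrics = summarize_metrics(
    pos_scores=[0.9, 0.8, 0.3, 0.7],
    neg_scores=[0.1, 0.6, 0.2, 0.05, 0.4, 0.3],
    neg_hours=2.0,
)
```

Sweeping `threshold` over a range of values on recordings from the deployment environment is one way to pick an operating point that balances recall against false activations.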


In [7]:
# Load default YAML config file for training
with open("openwakeword/examples/custom_model.yml", "r") as f:
    config = yaml.safe_load(f)  # safe_load is sufficient for a plain config file

In [8]:
# Modify values in the config and save a new version

config["target_phrase"] = [
    # "Clãriss",
    # "Clãriss?",
    # "Hey Clãriss",
    "Hey_Cledeess",
    # "Ólá Clãriss",
]
config["model_name"] = config["target_phrase"][0].replace(" ", "_")

config["n_samples"] = 35000  # for best results, use around 100,000
config["n_samples_val"] = 2000 # Default: 1000
config["steps"] = 30000
config["max_negative_weight"] = 3000

config["target_accuracy"] = 0.7
config["target_recall"] = 0.5
config["target_false_positives_per_hour"] = 0.2  # Default: 0.2

config['tts_batch_size'] = 50 # Default: 50
config['augmentation_batch_size'] = 16 # Default: 16

config["output_dir"] = "final_result"

config["background_paths"] = [
    "./audioset_16k",
    "./fma",
]  # multiple background datasets are supported
config["false_positive_validation_data_path"] = "validation_set_features.npy"
config["feature_data_files"] = {
    "ACAV100M_sample": "openwakeword_features_ACAV100M_2000_hrs_16bit.npy"
}

with open("my_model.yaml", "w") as file:
    yaml.dump(config, file)  # returns None when writing to a stream, so nothing to assign

config

{'model_name': 'Hey_Cledeess',
 'target_phrase': ['Hey_Cledeess'],
 'custom_negative_phrases': [],
 'n_samples': 35000,
 'n_samples_val': 2000,
 'tts_batch_size': 50,
 'augmentation_batch_size': 16,
 'piper_sample_generator_path': './piper-sample-generator',
 'output_dir': 'final_result',
 'rir_paths': ['./mit_rirs'],
 'background_paths': ['./audioset_16k', './fma'],
 'background_paths_duplication_rate': [1],
 'false_positive_validation_data_path': 'validation_set_features.npy',
 'augmentation_rounds': 1,
 'feature_data_files': {'ACAV100M_sample': 'openwakeword_features_ACAV100M_2000_hrs_16bit.npy'},
 'batch_n_per_class': {'ACAV100M_sample': 1024,
  'adversarial_negative': 50,
  'positive': 50},
 'model_type': 'dnn',
 'layer_size': 32,
 'steps': 30000,
 'max_negative_weight': 3000,
 'target_false_positives_per_hour': 0.2,
 'target_accuracy': 0.7,
 'target_recall': 0.5}

# Train the Model

With the data downloaded and training configuration set, we can now start training the model. We'll do this in parts to better illustrate the sequence, but you can also execute every step at once for a fully automated process.
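For a fully automated run, the three steps used in the cells below can be chained from Python. This is only a sketch: the flags and script path are the ones used in this notebook, while the helper names `build_step_commands` and `run_all_steps` are hypothetical.

```python
import subprocess
import sys

# Flags used by the three training cells in this notebook, in execution order
STEP_FLAGS = ("--generate_clips", "--augment_clips", "--train_model")


def build_step_commands(config_path, script="openwakeword/openwakeword/train.py"):
    """Return one command line per training step, in the order they should run."""
    return [
        [sys.executable, script, "--training_config", config_path, flag]
        for flag in STEP_FLAGS
    ]


def run_all_steps(config_path):
    """Run every step in sequence, stopping at the first one that fails."""
    for cmd in build_step_commands(config_path):
        subprocess.run(cmd, check=True)  # check=True raises on a non-zero exit code


commands = build_step_commands("my_model.yaml")
```

Calling `run_all_steps("my_model.yaml")` would then run clip generation, augmentation, and training back to back, at the cost of losing the per-step output separation that the individual cells provide.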

In [9]:
# Step 1: Generate synthetic clips
# For the number of clips we are using, this should take ~10 minutes on a free Google Colab instance with a T4 GPU
# If generation fails, you can simply run this command again as it will continue generating until the
# number of files meets the targets specified in the config file

!{sys.executable} openwakeword/openwakeword/train.py --training_config my_model.yaml --generate_clips

  torchaudio.set_audio_backend("soundfile")
INFO:root:##################################################
Generating positive clips for training
##################################################
DEBUG:generate_samples:Loading /workspace/standard/piper-sample-generator/models/en_US-libritts_r-medium.pt
  torch_model = torch.load(model_path)
INFO:generate_samples:Successfully loaded the model
DEBUG:generate_samples:CUDA available, using GPU
DEBUG:generate_samples:Batch 1/700 complete
DEBUG:generate_samples:Batch 2/700 complete
DEBUG:generate_samples:Batch 3/700 complete
DEBUG:generate_samples:Batch 4/700 complete
DEBUG:generate_samples:Batch 5/700 complete
DEBUG:generate_samples:Batch 6/700 complete
DEBUG:generate_samples:Batch 7/700 complete
DEBUG:generate_samples:Batch 8/700 complete
DEBUG:generate_samples:Batch 9/700 complete
DEBUG:generate_samples:Batch 10/700 complete
DEBUG:generate_samples:Batch 11/700 complete
DEBUG:generate_samples:Batch 12/700 complete
DEBUG:generate_samples:Bat

In [10]:
# Step 2: Augment the generated clips

!{sys.executable} openwakeword/openwakeword/train.py --training_config my_model.yaml --augment_clips

  torchaudio.set_audio_backend("soundfile")
INFO:root:##################################################
Computing openwakeword features for generated samples
##################################################
  >>> augment = PitchShift(..., output_type='dict')
  >>> augmented_samples = augment(samples).samples
  >>> augment = BandStopFilter(..., output_type='dict')
  >>> augmented_samples = augment(samples).samples
  >>> augment = AddColoredNoise(..., output_type='dict')
  >>> augmented_samples = augment(samples).samples
  >>> augment = AddBackgroundNoise(..., output_type='dict')
  >>> augmented_samples = augment(samples).samples
  >>> augment = Gain(..., output_type='dict')
  >>> augmented_samples = augment(samples).samples
  >>> augment = Compose(..., output_type='dict')
  >>> augmented_samples = augment(samples).samples
Computing features: 100%|███████████████████| 2187/2187 [04:38<00:00,  7.85it/s]
Trimming empty rows: 35it [00:00, 94.71it/s]                                    
Co

In [None]:
# Step 3: Train model

!{sys.executable} openwakeword/openwakeword/train.py --training_config my_model.yaml --train_model

  torchaudio.set_audio_backend("soundfile")
INFO:root:##################################################
Starting training sequence 1...
##################################################
Training: 100%|██████████████████████████▉| 29999/30000 [07:44<00:00, 64.53it/s]
INFO:root:##################################################
Starting training sequence 2...
##################################################
INFO:root:Increasing weight on negative examples to reduce false positives...
Training:   5%|█▍                          | 155/3000.0 [00:16<00:37, 76.76it/s]

In [None]:
!pip install tensorflow tf_keras ai_edge_litert onnxsim

#!pip install \
#   ai-edge-litert==1.2.0 \
#   tensorflow==2.19.0 \
#   tensorflow-addons==0.23.0 \
#   tensorflow-io-gcs-filesystem==0.37.1


In [None]:
# Step 4 (Optional): On Google Colab, sometimes the .tflite model isn't saved correctly
# If so, run this cell to retry

# # Manually save to tflite as this doesn't work right in colab (broken in python 3.11, default in Colab as of January 2025)
# def convert_onnx_to_tflite(onnx_model_path, output_path):
#     """Converts an ONNX version of an openwakeword model to the Tensorflow tflite format."""
#     # imports
#     import onnx
#     import logging
#     import tempfile
#     from onnx_tf.backend import prepare
#     import tensorflow as tf

#     # Convert to tflite from onnx model
#     onnx_model = onnx.load(onnx_model_path)
#     tf_rep = prepare(onnx_model, device="CPU")
#     with tempfile.TemporaryDirectory() as tmp_dir:
#         tf_rep.export_graph(os.path.join(tmp_dir, "tf_model"))
#         converter = tf.lite.TFLiteConverter.from_saved_model(os.path.join(tmp_dir, "tf_model"))
#         tflite_model = converter.convert()

#         logging.info(f"####\nSaving tflite model to '{output_path}'")
#         with open(output_path, 'wb') as f:
#             f.write(tflite_model)

#     return None

# convert_onnx_to_tflite(f"{config['output_dir']}/{config['model_name']}.onnx", f"{config['output_dir']}/{config['model_name']}.tflite")

# Convert ONNX model to tflite using `onnx2tf` library (works for python 3.11 as of January 2025)
onnx_model_path = f"{config['output_dir']}/{config['model_name']}.onnx"
name1, name2 = f"{config['output_dir']}/{config['model_name']}_float32.tflite", f"{config['output_dir']}/{config['model_name']}.tflite"
!onnx2tf -i {onnx_model_path} -o {config["output_dir"]}/ -kat onnx____Flatten_0
!mv {name1} {name2}

# Automatically download the trained model files
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    from google.colab import files
    files.download(f"{config['output_dir']}/{config['model_name']}.onnx")
    files.download(f"{config['output_dir']}/{config['model_name']}.tflite")

After the model finishes training, the auto training script will automatically convert it to ONNX and tflite versions, saving them as `<output_dir>/<model_name>.onnx` and `<output_dir>/<model_name>.tflite` relative to the present working directory, where `<output_dir>` (`final_result` in this notebook) and `<model_name>` are defined in the YAML training config file. Either version can be used as normal with `openwakeword`. I recommend testing them with the [`detect_from_microphone.py`](https://github.com/dscripka/openWakeWord/blob/main/examples/detect_from_microphone.py) example script to see how the model performs!
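For a quick offline check before trying the microphone script, something like the following sketch could score a 16 kHz, 16-bit audio array with the trained ONNX model. It assumes the installed `openwakeword` version exposes `Model(wakeword_models=..., inference_framework="onnx")` and that `predict` returns a dictionary of scores; the helper names `iter_frames` and `max_score` are hypothetical.

```python
import numpy as np

FRAME_SAMPLES = 1280  # 80 ms of 16 kHz audio per frame


def iter_frames(audio, frame_samples=FRAME_SAMPLES):
    """Yield consecutive fixed-size frames, dropping any incomplete final frame."""
    for start in range(0, len(audio) - frame_samples + 1, frame_samples):
        yield audio[start:start + frame_samples]


def max_score(model_path, audio):
    """Highest score over all frames (needs openwakeword and a trained model)."""
    from openwakeword.model import Model  # imported lazily: only needed for inference

    model = Model(wakeword_models=[model_path], inference_framework="onnx")
    best = 0.0
    for frame in iter_frames(audio):
        scores = model.predict(frame)  # assumed: dict mapping model name -> score
        best = max(best, max(scores.values()))
    return best


# One second of silence splits into 12 full frames of 1280 samples each
silence = np.zeros(16000, dtype=np.int16)
n_frames = sum(1 for _ in iter_frames(silence))
```

With the configuration used in this notebook, `max_score("final_result/Hey_Cledeess.onnx", audio)` would be the call to try, where `audio` is an `int16` NumPy array containing a recording of the wake phrase.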