First public release of the VoxRT wake-word model. It detects the phrase "Hey Assistant" and runs on the VoxRT custom on-device inference runtime.
Quality
Held-out test split: 5,240 positive utterances + 6,416 hard-negative utterances
(isolated "Hey", isolated "Assistant", competitor wake-words like "Hey Siri",
phonetic neighbours, arbitrary speech, non-speech audio). Speakers disjoint
from train + val.
- ROC AUC: 0.9966
- Average precision (PR AUC): 0.9899
At the recommended deploy threshold of 0.9:
| Metric | Test value |
|---|---|
| Precision | 0.993 |
| Recall | 0.982 |
| F1 | 0.987 |
| FPR | 0.5 % |
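
The four table metrics all derive from confusion-matrix counts at the chosen threshold. The following is a minimal Python sketch of that calculation, assuming per-utterance scores in [0, 1], binary labels, and a "score >= threshold fires" convention. The function name `metrics_at_threshold` and the toy scores are illustrative only and are not part of any VoxRT tooling. The last lines check that the reported F1 is the harmonic mean of the reported precision and recall.

```python
def metrics_at_threshold(scores, labels, threshold=0.9):
    """Compute precision, recall, F1 and FPR at a score threshold.

    scores: model outputs in [0, 1]; labels: 1 = wake phrase, 0 = negative.
    A detection fires when score >= threshold (assumed convention).
    """
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        fired = score >= threshold
        if fired and label == 1:
            tp += 1
        elif fired and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Toy data (NOT VoxRT results): three positives followed by three negatives.
toy = metrics_at_threshold(
    scores=[0.97, 0.95, 0.40, 0.92, 0.10, 0.05],
    labels=[1, 1, 1, 0, 0, 0],
)
print(toy)

# Consistency check on the reported values: F1 = 2PR / (P + R).
reported_f1 = 2 * 0.993 * 0.982 / (0.993 + 0.982)
print(round(reported_f1, 3))
```

Raising the threshold trades recall for precision, so the same function can be swept over several thresholds to see how the operating point moves.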
Architecture + footprint
- 8-block depthwise-separable Conv1D, dilations [1, 2, 4, 4, 4, 2, 2, 1], 64 channels
- ~48 K parameters, fp16 weights, AES-256-GCM at-rest encryption
- ~100 KB on disk (the .vxrt artefact below)
- 64-bin Slaney-norm mel frontend, 16 kHz mono PCM input, 200-frame (2 s) sliding window
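
The figures in this list can be cross-checked with a little arithmetic. The sketch below derives the frame hop implied by a 200-frame, 2 s window, the raw size of the fp16 weights, and the receptive field of the dilated stack. The kernel size is not stated in these notes, so `KERNEL_SIZE = 3` is an assumption, as is the simplification of one stride-1 dilated convolution per block.

```python
# Values taken from the release notes.
SAMPLE_RATE_HZ = 16_000
WINDOW_FRAMES = 200
WINDOW_MS = 2_000.0
DILATIONS = [1, 2, 4, 4, 4, 2, 2, 1]
PARAM_COUNT = 48_000          # "~48 K parameters"
BYTES_PER_FP16 = 2

# ASSUMPTION: kernel size is not given in the notes; 3 is a common choice.
KERNEL_SIZE = 3


def receptive_field(dilations, kernel_size):
    """Receptive field (in frames) of stacked stride-1 dilated convolutions.

    Each layer adds (kernel_size - 1) * dilation frames of context.
    Assumes one dilated convolution per block.
    """
    return 1 + (kernel_size - 1) * sum(dilations)


hop_ms = WINDOW_MS / WINDOW_FRAMES                     # frame hop implied by the window
hop_samples = round(SAMPLE_RATE_HZ * hop_ms / 1000)    # PCM samples per hop
weights_bytes = PARAM_COUNT * BYTES_PER_FP16           # raw fp16 weight payload
rf_frames = receptive_field(DILATIONS, KERNEL_SIZE)

print(f"hop: {hop_ms} ms ({hop_samples} samples)")
print(f"fp16 weights: {weights_bytes} bytes")
print(f"receptive field: {rf_frames} frames ({rf_frames * hop_ms} ms)")
```

The roughly 96 KB of raw weights is in line with the ~100 KB artefact size once container and encryption overhead are added. The receptive-field figure holds only under the kernel-size assumption.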
Runtime performance
arm64-v8a release builds, post-warmup, RTF = wall-time-per-frame ÷ frame audio duration:
| Device | SoC | RTF |
|---|---|---|
| Xiaomi Redmi 9C (Cortex-A73 pin) | SD 662 (midrange 2020) | 0.021 |
| iPhone 13 Pro Max | Apple A15 Bionic | 0.015 |
≈ 48–67× faster than realtime (1 ÷ RTF) on phone-class hardware, well within an always-on power budget.
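
The speed-up over realtime is the reciprocal of the RTF. The sketch below converts the two table entries and also estimates wall time per frame. The 10 ms frame duration is an assumption inferred from the 200-frame, 2 s window, and the helper name `speedup` is illustrative.

```python
# ASSUMPTION: frame audio duration inferred from 200 frames spanning 2 s.
FRAME_MS = 10.0

# RTF values from the table above.
rtf_by_device = {
    "Xiaomi Redmi 9C": 0.021,
    "iPhone 13 Pro Max": 0.015,
}


def speedup(rtf):
    """Times-faster-than-realtime for a given real-time factor."""
    return 1.0 / rtf


for device, rtf in rtf_by_device.items():
    wall_ms = rtf * FRAME_MS  # RTF = wall time per frame / frame audio duration
    print(f"{device}: {speedup(rtf):.1f}x realtime, {wall_ms:.2f} ms per frame")
```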
Install
Pair with one of the consumer libraries:
- Android: voxrt-wake-word-android v0.1.0
- iOS: voxrt-wake-word-ios v0.1.0
Or download directly and verify the SHA-256 checksum:
curl -L -o voxrt_wake_word.vxrt \
  https://github.com/VoxRT/voxrt-wake-word-models/releases/download/v0.1.0/voxrt_wake_word.vxrt
echo "9d40bdc132a2ad8e85bd8a28bb49b77c51a7c62f60567222a037e44418510e8f  voxrt_wake_word.vxrt" | shasum -a 256 -c
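
Where `shasum` is not available, for example in a CI step or on Windows, the same check can be done with Python's standard `hashlib`. This is a minimal sketch: the helper names `sha256_of` and `verify_model` are illustrative, and the file is assumed to sit in the current directory under the name used in the download command.

```python
import hashlib
from pathlib import Path

# Checksum published in these release notes.
EXPECTED_SHA256 = "9d40bdc132a2ad8e85bd8a28bb49b77c51a7c62f60567222a037e44418510e8f"


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path, expected=EXPECTED_SHA256):
    """True if the file's digest matches the expected checksum."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    model_path = Path("voxrt_wake_word.vxrt")
    if not model_path.exists():
        print(f"{model_path} not found; download it first")
    elif verify_model(model_path):
        print("checksum OK")
    else:
        print("checksum mismatch; do not load this file")
```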
License
VoxRT proprietary. Redistribution as part of the unmodified voxrt-wake-word-{android,ios} libraries is permitted for commercial apps without per-installation fees. See LICENSE for full terms. Custom phrases, multi-phrase detection, and language extension are available via the commercial VoxRT SDK; contact help@voxrt.com.