WakeNet Wake Word Model

:link_to_translation:`zh_CN:[中文]`

WakeNet is a wake word engine built on a neural network for low-power embedded MCUs. Currently, WakeNet supports up to 5 wake words.

Overview

Please see the flow diagram of WakeNet below:

overview
  • Speech Feature
    We use the MFCC method to extract speech spectrum features. The input audio is sampled at 16 kHz, mono, and encoded as signed 16-bit. Each frame has a window width and a step size of 30 ms.
.. only:: latex

    .. figure:: ../../_static/QR_MFCC.png
        :alt: overview
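
The frame arithmetic described above can be sketched in a few lines. This is only an illustration of the stated parameters (16 kHz sample rate, 30 ms window and step), not WakeNet's actual front end. The helper name ``split_into_frames`` and the choice to drop a trailing partial frame are assumptions of this sketch.

```python
SAMPLE_RATE_HZ = 16_000   # from the text: 16 kHz, mono, signed 16-bit
FRAME_MS = 30             # from the text: window width and step size are both 30 ms

# Samples per frame: 16000 samples/s * 0.030 s = 480
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000


def split_into_frames(samples):
    """Split PCM samples into non-overlapping 30 ms frames.

    Because the step equals the window width, frames do not overlap.
    A trailing partial frame is dropped (an assumption of this sketch).
    """
    n_frames = len(samples) // FRAME_LEN
    return [samples[i * FRAME_LEN:(i + 1) * FRAME_LEN] for i in range(n_frames)]


one_second = [0] * SAMPLE_RATE_HZ      # one second of silence
frames = split_into_frames(one_second)
print(FRAME_LEN, len(frames))          # 480 33
```

Each 480-sample frame would then be passed to the MFCC feature extractor.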

  • Neural Network
    The neural network structure is now in its ninth version:
    • WakeNet1, WakeNet2, WakeNet3, WakeNet4, WakeNet6, and WakeNet7 are no longer in use.
    • WakeNet5 supports only the ESP32 chip.
    • WakeNet8 and WakeNet9 support only the ESP32-S3 chip and are built on the Dilated Convolution structure.
.. only:: latex

    .. figure:: ../../_static/QR_Dilated_Convolution.png
        :alt: overview

    WakeNet5, WakeNet5X2, and WakeNet5X3 share the same network structure, but WakeNet5X2 and WakeNet5X3 have more parameters than WakeNet5. Please refer to :doc:`Resource Consumption <../benchmark/README>` for details.

  • Keyword Triggering Method
    For a continuous audio stream, we average the recognition results over several frames to obtain a smoothed prediction (M), which improves the accuracy of keyword triggering. A trigger command is sent only when M is larger than the set threshold.
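
The triggering rule above can be sketched as a moving average followed by a threshold test. This is a minimal illustration, not WakeNet's implementation. The class name ``SmoothedTrigger``, the window length, the threshold value, and the per-frame scores are all hypothetical.

```python
from collections import deque


class SmoothedTrigger:
    """Average the last `window` per-frame scores and compare against a threshold."""

    def __init__(self, window=3, threshold=0.8):
        # window and threshold are illustrative values, not WakeNet defaults
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def update(self, frame_score):
        """Add one per-frame score; return True when the smoothed value M exceeds the threshold."""
        self.scores.append(frame_score)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough frames yet to form an average
        m = sum(self.scores) / len(self.scores)
        return m > self.threshold


# Hypothetical per-frame recognition scores: one isolated spike, then a sustained run.
frame_scores = [0.1, 0.95, 0.2, 0.9, 0.9, 0.95, 0.9, 0.85]
trigger = SmoothedTrigger(window=3, threshold=0.8)
fired = [trigger.update(s) for s in frame_scores]
print(fired)
```

In this run the isolated spike of 0.95 in the second frame does not fire, because the average over its window stays below the threshold. Only the sustained run of high scores at the end does.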

The wake words supported by Espressif chips are listed below:

Chip and model columns:

  • ESP32: WakeNet 5, WakeNet 5X2, WakeNet 5X3
  • ESP32-S3: WakeNet 8 (Q16, Q8), WakeNet 9 (Q16, Q8)

Wake words (rows):

  • Hi,Lexin
  • nihaoxiaozhi
  • nihaoxiaoxin
  • xiaoaitongxue
  • Alexa
  • Hi,ESP
  • Customized word

Use WakeNet

Resource Occupancy

For the resource occupancy of this model, see :doc:`Resource Occupancy <../benchmark/README>`.