Keyword Spotting on Arm Cortex-M boards.
The first step in deploying the trained keyword spotting (KWS) models on microcontrollers is quantization, which is described here. This directory contains example code and steps for running a quantized DNN model on any Cortex-M board using mbed-cli and the CMSIS-NN library. It also includes an example that integrates the KWS model onto a Cortex-M development board with an on-board microphone to demonstrate keyword spotting on live audio data.
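To make the quantization step concrete, here is a minimal sketch of converting floating-point values to 8-bit fixed point with a power-of-two scale, the general style of data the CMSIS-NN q7 kernels operate on. The helper names (`choose_frac_bits`, `quantize_q7`, `dequantize_q7`) are hypothetical and are not part of this repository or of CMSIS-NN; the actual quantization procedure for these models may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: choose the number of fractional bits so that the
// largest magnitude in `values` fits a signed 8-bit fixed-point number
// (integer bits + fractional bits = 7, plus one sign bit).
int choose_frac_bits(const std::vector<float>& values) {
    float max_abs = 0.0f;
    for (float v : values) {
        max_abs = std::max(max_abs, std::fabs(v));
    }
    if (max_abs == 0.0f) {
        return 7;  // all zeros: any format works, use pure fractional
    }
    // Integer bits needed to cover max_abs (negative for small ranges).
    int int_bits = static_cast<int>(std::ceil(std::log2(max_abs)));
    return 7 - int_bits;
}

// Scale by 2^frac_bits, round to nearest, and saturate to the int8 range.
std::int8_t quantize_q7(float value, int frac_bits) {
    assert(frac_bits >= -24 && frac_bits <= 24);  // keep the scale sane
    float scaled = std::round(value * std::ldexp(1.0f, frac_bits));
    scaled = std::min(127.0f, std::max(-128.0f, scaled));
    return static_cast<std::int8_t>(scaled);
}

// Map a quantized value back to floating point to inspect the error.
float dequantize_q7(std::int8_t q, int frac_bits) {
    return std::ldexp(static_cast<float>(q), -frac_bits);
}
```

In this sketch, each tensor (weights or activations) would get its own `frac_bits` value, and comparing `dequantize_q7(quantize_q7(x, n), n)` with `x` gives a quick view of the quantization error.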
Get the CMSIS-NN library and install mbed-cli
Clone the CMSIS-5 library, which contains the optimized neural network kernels for Cortex-M.
cd Deployment
git clone https://github.com/ARM-software/CMSIS_5.git
Install mbed-cli and its python dependencies.
pip install mbed-cli
Build and run a simple KWS inference
In this example, KWS inference is run on audio data provided through a .h file. First create a new project. The first time a project is created after installing mbed-cli, you may be prompted to install additional Python dependencies; install them before continuing.
mbed new kws_simple_test --mbedlib
Fetch the required mbed libraries for compilation.
cd kws_simple_test
mbed deploy
Compile the code for the mbed board (for example NUCLEO_F411RE).
mbed compile -m NUCLEO_F411RE -t GCC_ARM --source . \
    --source ../Source/KWS --source ../Source/NN --source ../Source/MFCC \
    --source ../Source/local_NN --source ../Examples/simple_test \
    --source ../CMSIS_5/CMSIS/NN/Include --source ../CMSIS_5/CMSIS/NN/Source \
    --source ../CMSIS_5/CMSIS/DSP/Include --source ../CMSIS_5/CMSIS/DSP/Source \
    --source ../CMSIS_5/CMSIS/Core/Include \
    --profile ../release_O3.json -j 8
Copy the binary (.bin) to the board (make sure the board is detected and mounted). Then open a serial terminal (e.g. putty or minicom) to see the final classification output.
cp ./BUILD/NUCLEO_F411RE/GCC_ARM/kws_simple_test.bin /media/<user>/NODE_F411RE/
sudo minicom
Run KWS inference on live audio
This example runs keyword spotting inference on live audio captured using the on-board microphones on the STM32F746NG discovery kit. When performing keyword spotting on live audio data with multiple noise sources, outputs are typically averaged over a specified window to generate smooth predictions. The averaging window length and the detection threshold (which may differ for each keyword) are two key parameters in determining the overall keyword spotting accuracy and user experience.
mbed new kws_realtime_test --create-only
cd kws_realtime_test
cp ../Examples/realtime_test/mbed_libs/*.lib .
mbed deploy
mbed compile -m DISCO_F746NG -t GCC_ARM \
    --source . --source ../Source --source ../Examples/realtime_test \
    --source ../CMSIS_5/CMSIS/NN/Include --source ../CMSIS_5/CMSIS/NN/Source \
    --source ../CMSIS_5/CMSIS/DSP/Include --source ../CMSIS_5/CMSIS/DSP/Source \
    --source ../CMSIS_5/CMSIS/Core/Include \
    --profile ../release_O3.json -j 8
cp ./BUILD/DISCO_F746NG/GCC_ARM/kws_realtime_test.bin /media/<user>/DIS_F746NG/
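The averaging window and detection threshold used for live audio can be illustrated with a small sketch. The `PredictionSmoother` class below is hypothetical and is not the code in `Examples/realtime_test`; it assumes one vector of class scores per inference and, for simplicity, a single threshold shared by all keywords rather than a per-keyword one.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical smoother: keeps the last `window` score vectors and reports
// a class only when its averaged score reaches `threshold`.
class PredictionSmoother {
public:
    PredictionSmoother(std::size_t num_classes, std::size_t window,
                       float threshold)
        : num_classes_(num_classes), window_(window), threshold_(threshold) {
        assert(num_classes > 0 && window > 0);
    }

    // Adds one frame of class scores. Returns the index of the detected
    // class, or -1 if the window is not yet full or no averaged score
    // reaches the threshold.
    int update(const std::vector<float>& scores) {
        assert(scores.size() == num_classes_);
        history_.push_back(scores);
        if (history_.size() > window_) {
            history_.pop_front();  // drop the oldest frame
        }
        if (history_.size() < window_) {
            return -1;
        }
        // Average each class over the window and track the best one.
        std::size_t best = 0;
        float best_avg = -1.0f;
        for (std::size_t c = 0; c < num_classes_; ++c) {
            float sum = 0.0f;
            for (const std::vector<float>& frame : history_) {
                sum += frame[c];
            }
            float avg = sum / static_cast<float>(window_);
            if (avg > best_avg) {
                best_avg = avg;
                best = c;
            }
        }
        return best_avg >= threshold_ ? static_cast<int>(best) : -1;
    }

private:
    std::size_t num_classes_;
    std::size_t window_;
    float threshold_;
    std::deque<std::vector<float>> history_;
};
```

A longer window smooths out spurious single-frame detections at the cost of added latency, while a higher threshold trades missed keywords for fewer false alarms, so both are usually tuned together on recorded audio.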
Build an example on FRDM-K64F using gcc and make
To build this example, clone the CMSIS_5 repository and then run make. This example was created by exporting a simple hello-world example from the mbed online compiler and editing the Makefile to incorporate the source files required for the keyword spotting example.
cd Deployment
# Clone CMSIS_5 repository (if not done already)
git clone https://github.com/ARM-software/CMSIS_5.git
cd Examples/simple_test_k64f_gcc
make -j 8
# copy binary to the device
cp ./BUILD/simple_test_k64f_gcc.bin /media/<user>/DAPLINK/
Note: The examples provided use floating point operations for MFCC feature extraction, but it should be possible to convert them to fixed-point operations for deploying on microcontrollers that do not have dedicated floating point units.
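As a rough illustration of what such a conversion involves, the sketch below shows two basic building blocks in Q15 format (16-bit signed values representing the range [-1, 1)): converting a float to Q15 and multiplying two Q15 values with rounding and saturation, as might replace a floating-point multiply when applying a window function to audio samples. The function names are hypothetical and this is not code from this repository; the CMSIS-DSP library provides its own optimized fixed-point routines.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert a float in roughly [-1, 1) to Q15, saturating out-of-range input.
std::int16_t float_to_q15(float x) {
    assert(!std::isnan(x));
    float scaled = std::round(x * 32768.0f);
    if (scaled > 32767.0f) scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    return static_cast<std::int16_t>(scaled);
}

// Multiply two Q15 values. The 32-bit product is in Q30; adding 2^14 and
// shifting right by 15 rounds it back to Q15. This assumes an arithmetic
// right shift for negative values, which is what common compilers provide
// and what C++20 guarantees.
std::int16_t q15_mul(std::int16_t a, std::int16_t b) {
    std::int32_t product = static_cast<std::int32_t>(a) * b;
    product = (product + (1 << 14)) >> 15;
    if (product > 32767) product = 32767;  // only (-1) * (-1) overflows
    return static_cast<std::int16_t>(product);
}
```

For example, `q15_mul(float_to_q15(0.5f), float_to_q15(0.5f))` yields 8192, which is 0.25 in Q15.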