# Keyword Spotting - Pac-Man

This tutorial describes how to use the MLTK to develop a "Pac-Man" keyword spotting demo.

The basic setup for this demo is as follows:  
![](https://github.com/SiliconLabs/mltk/raw/master/docs/img/pacman_demo_overview.png)

In the demo, embedded machine learning is used to detect the keywords:
- __Left__
- __Right__
- __Up__
- __Down__
- __Stop__
- __Go__

When a keyword is detected, its corresponding ID is sent to a webpage via Bluetooth Low Energy. The webpage uses JavaScript to process the keyword ID and move Pac-Man accordingly.

## Live Demo

A live demo for this tutorial is available online:  
[https://mltk-pacman.web.app](https://mltk-pacman.web.app)


__NOTE:__ To use this demo, you must have a [BRD2601](https://siliconlabs.github.io/mltk/docs/other/supported_hardware.html#brd2601) development board.

## Quick Links

- [GitHub Source](https://github.com/SiliconLabs/mltk/blob/master/mltk/tutorials/keyword_spotting_pacman.ipynb) - View this tutorial on GitHub
- [Run on Colab](https://colab.research.google.com/github/siliconlabs/mltk/blob/master/mltk/tutorials/keyword_spotting_pacman.ipynb) - Run this tutorial on Google Colab
- [Train in the "Cloud"](https://siliconlabs.github.io/mltk/mltk/tutorials/cloud_training_with_vast_ai.html) - _Vastly_ improve training times by training this model in the "cloud"
- [C++ Example Application](https://siliconlabs.github.io/mltk/docs/cpp_development/examples/ble_audio_classifier.html) - View this tutorial's associated C++ example application
- [Pac-Man Webpage Source](https://github.com/SiliconLabs/mltk/blob/master/cpp/shared/apps/ble_audio_classifier/web/pacman) - View the Pac-Man webpage's source code on GitHub
- [Machine Learning Model](https://siliconlabs.github.io/mltk/docs/python_api/models/siliconlabs/keyword_spotting_pacman_v3.html) - View this tutorial's associated machine learning model
- [Live Demo](https://mltk-pacman.web.app) - Play Pac-Man using the keywords: Left, Right, Up, Down 
- [Presentation PDF](https://cms.tinyml.org/wp-content/uploads/talks2022/tinyML_Talks_Dan_Riedler_221025.pdf) - Presentation describing how this demo was created
- [Presentation Video](https://www.youtube.com/watch?v=xhiFMDOyA0g) - YouTube video of the presentation given to TinyML.org for this tutorial

## Overview

### Objectives

After completing this tutorial, you will have:
1. A better understanding of how audio classification machine learning models work
2. All of the tools needed to develop your own keyword spotting model
3. A better understanding of how to issue commands to a webpage from an embedded MCU via Bluetooth Low Energy
4. A working demo to play the game Pac-Man using the keywords: "Left", "Right", "Up", "Down", "Stop", "Go"

### Content

This tutorial is divided into the following sections:
- [Prerequisite reading](#prerequisite-reading)
- [Creating the machine learning model](#creating-the-machine-learning-model)
- [Creating the firmware application](#creating-the-firmware-application)
- [Creating the Pac-Man webpage](#creating-the-pac-man-webpage)
- [Running the demo](#running-the-demo)

### Running this tutorial from a notebook

For documentation purposes, this tutorial was designed to run within a [Jupyter Notebook](https://jupyter.org). 
The notebook can either run locally on your PC _or_ on a remote server like [Google Colab](https://colab.research.google.com/notebooks/welcome.ipynb).  

- Refer to the [Notebook Examples Guide](https://siliconlabs.github.io/mltk/docs/guides/notebook_examples_guide.html) for more details
- Click here: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/siliconlabs/mltk/blob/master/mltk/tutorials/keyword_spotting_pacman.ipynb) to run this tutorial interactively in your browser

__NOTE:__ Some of the following sections require this tutorial to be running locally with a supported embedded platform connected.

### Running this tutorial from the command-line

While this tutorial uses a [Jupyter Notebook](https://jupyter.org), 
the recommended approach is to use your favorite text editor and standard command terminal, no Jupyter Notebook required.  

See the [Standard Python Package Installation](https://siliconlabs.github.io/mltk/docs/installation.html#standard-python-package) guide for more details on how to enable the `mltk` command in your local terminal.

In this mode, when you encounter a `!mltk` command in this tutorial, the command should actually run in your local terminal (excluding the `!`)

## Required Hardware

To play this tutorial's game using machine learning + keyword spotting, the [BRD2601](https://siliconlabs.github.io/mltk/docs/other/supported_hardware.html#brd2601) development board is required.

## Install MLTK Python Package

Before using the MLTK, it must first be installed.  
See the [Installation Guide](https://siliconlabs.github.io/mltk/docs/installation.html) for more details.

In [None]:
!pip install --upgrade silabs-mltk

All MLTK modeling operations are accessible via the `mltk` command.  
Run the command `mltk --help` to ensure it is working.  
__NOTE:__ The exclamation point `!` tells the Notebook to run a shell command. It is not required in a [standard terminal](https://siliconlabs.github.io/mltk/docs/installation.html#standard-python-package).

In [1]:
!mltk --help

Usage: mltk [OPTIONS] COMMAND [ARGS]...

  Silicon Labs Machine Learning Toolkit

  This is a Python package with command-line utilities and scripts to aid the
  development of machine learning models for Silicon Lab's embedded platforms.

Options:
  --version         Display the version of this mltk package and exit
  --gpu / --no-gpu  Disable usage of the GPU. 
                    This does the same as defining the environment variable: CUDA_VISIBLE_DEVICES=-1
                    Example:
                    mltk --no-gpu train image_example1
  --help            Show this message and exit.

Commands:
  build               MLTK build commands
  classify_audio      Classify keywords/events detected in a microphone's...
  classify_image      Classify images detected by a camera connected to...
  commander           Silab's Commander Utility
  compile             Compile a model for the specified accelerator
  custom              Custom Model Operations
  evaluate            Evaluate a t

## Prerequisite Reading

Before continuing with this tutorial, it is recommended to review the following documentation:
- [Keyword Spotting Overview](https://siliconlabs.github.io/mltk/docs/audio/keyword_spotting_overview.html) - Provides overview of how embedded keyword spotting works
- [Keyword Spotting Tutorial](https://siliconlabs.github.io/mltk/mltk/tutorials/keyword_spotting_on_off.html) - Provides an in-depth tutorial on how to create a keyword spotting model

## Creating the Machine Learning Model

The pre-defined [Model Specification](https://siliconlabs.github.io/mltk/docs/guides/model_specification.html) used by the tutorial may be found on [GitHub](https://github.com/SiliconLabs/mltk/tree/master/mltk/models/siliconlabs/keyword_spotting_pacman_v3.py).

This model is a standard audio classification model designed to detect the classes:  
- Left
- Right
- Up
- Down
- Stop
- Go
- _unknown_

Additionally, this model augments the training samples by adding audio recorded while playing the Pac-Man game. In this way, the model can be more robust to the background noise generated while playing the game.

Refer to the model, [keyword_spotting_pacman_v3](https://siliconlabs.github.io/mltk/docs/python_api/models/siliconlabs/keyword_spotting_pacman_v3.html) for more details.

### Select the dataset

This model was trained using several different datasets:  
- [mltk.datasets.audio.direction_commands](https://siliconlabs.github.io/mltk/docs/python_api/datasets/audio/direction_commands.html) - Synthetically generated keywords: left, right, up, down, stop, go
- [mltk.datasets.audio.speech_commands_v2](https://siliconlabs.github.io/mltk/docs/python_api/datasets/audio/speech_commands_v2.html) - Human generated keywords: left, right, up, down, stop, go
- [mltk.datasets.audio.mlcommons.ml_commons_keyword](https://siliconlabs.github.io/mltk/docs/python_api/datasets/audio/ml_commons/keywords.html) - Large collection of keywords, random subset used for *unknown* class
- [mltk.datasets.audio.background_noise.esc50](https://siliconlabs.github.io/mltk/docs/python_api/datasets/audio/background_noise/esc50.html) - Collection of various noises, random subset used for *unknown* class
- [mltk.datasets.audio.background_noise.ambient](https://siliconlabs.github.io/mltk/docs/python_api/datasets/audio/background_noise/ambient.html) - Collection of various background noises, mixed into other samples for augmentation
- [mltk.datasets.audio.background_noise.brd2601](https://siliconlabs.github.io/mltk/docs/python_api/datasets/audio/background_noise/brd2601.html) - "Silence" recorded by BRD2601 microphone, mixed into other samples to make them "sound" like they came from the BRD2601's microphone
- Pac-Man game noise - Recording from Pac-Man game play, mixed into other samples for augmentation



### Model Parameter Tradeoffs

We have two main requirements when choosing the model parameters:
- We want the spectrogram resolution and convolutional filters to be as high as possible so that the model can make accurate predictions
- We want the model's computational complexity to be as small as possible so that inference latency is small and keywords are quickly detected while playing the game

Note that the larger the spectrogram resolution, the larger the model's input size and thus the larger the model's computational complexity. Likewise, more convolution filters also increase the model's computational complexity. As such, we need to find a middle ground for these parameters.

The MLTK offers two tools that can help when choosing these parameters:  
- [Model Profiler](https://siliconlabs.github.io/mltk/docs/guides/model_profiler.html) - This allows for profiling the model on the embedded device and determining the inference latency __before__ fully training the model
- [Audio Visualizer Utility](https://siliconlabs.github.io/mltk/docs/audio/audio_feature_generator.html#audio-visualizer-utility) - This allows for visualizing the generated spectrograms in real-time

### Audio Feature Generator Settings

This model uses the following [Audio Feature Generator](https://siliconlabs.github.io/mltk/docs/audio/audio_feature_generator.html) settings:

In [2]:
from mltk.core.preprocess.audio.audio_feature_generator import AudioFeatureGeneratorSettings

frontend_settings = AudioFeatureGeneratorSettings()

frontend_settings.sample_rate_hz = 16000
frontend_settings.sample_length_ms = 1000                       # A 1s buffer should be enough to capture the keywords
frontend_settings.window_size_ms = 30
frontend_settings.window_step_ms = 10
frontend_settings.filterbank_n_channels = 104                   # We want this value to be as large as possible
                                                                # while still allowing for the ML model to execute efficiently on the hardware
frontend_settings.filterbank_upper_band_limit = 7500.0
frontend_settings.filterbank_lower_band_limit = 125.0           # The dev board mic seems to have a lot of noise at lower frequencies

frontend_settings.noise_reduction_enable = True                 # Enable the noise reduction block to help ignore background noise in the field
frontend_settings.noise_reduction_smoothing_bits = 10
frontend_settings.noise_reduction_even_smoothing =  0.025
frontend_settings.noise_reduction_odd_smoothing = 0.06
frontend_settings.noise_reduction_min_signal_remaining = 0.40   # This value is fairly large (which makes the background noise reduction small)
                                                                # But it has been found to still give good results
                                                                # i.e. There is still some background noise reduction,
                                                                # but the actual signal is still (mostly) untouched

frontend_settings.dc_notch_filter_enable = True                 # Enable the DC notch filter, to help remove the DC signal from the dev board's mic
frontend_settings.dc_notch_filter_coefficient = 0.95

frontend_settings.quantize_dynamic_scale_enable = True          # Enable dynamic quantization, this dynamically converts the uint16 spectrogram to int8
frontend_settings.quantize_dynamic_scale_range_db = 40.0


This uses a 16kHz sample rate which was found to give better performance at the expense of more RAM.

```python
frontend_settings.sample_rate_hz = 16000
```

To help reduce the model computational complexity, only a 1000ms sample length is used. 

```python
frontend_settings.sample_length_ms = 1000
```

The idea here is that it only takes at most ~1000ms to say any of the keywords (i.e. the audio buffer needs to be large enough to hold the entire keyword but no larger). 
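
As a rough illustration of the RAM cost mentioned above, the following sketch converts the sample rate and sample length into a buffer size. The helper `ms_to_samples` is illustrative and not part of the MLTK API, and the 16-bit PCM sample width is an assumption.

```python
# Illustrative helper (not part of the MLTK API): convert a duration in
# milliseconds into a number of audio samples at the given sample rate.
def ms_to_samples(duration_ms: int, sample_rate_hz: int) -> int:
    return (sample_rate_hz * duration_ms) // 1000


SAMPLE_RATE_HZ = 16000    # frontend_settings.sample_rate_hz
SAMPLE_LENGTH_MS = 1000   # frontend_settings.sample_length_ms

buffer_samples = ms_to_samples(SAMPLE_LENGTH_MS, SAMPLE_RATE_HZ)

# Assumption: each sample is stored as 16-bit PCM (2 bytes)
buffer_bytes = buffer_samples * 2

print(buffer_samples)  # 16000
print(buffer_bytes)    # 32000
```

Under the same assumption, an 8kHz sample rate would need half as many bytes for the same 1000ms buffer, which is the RAM tradeoff noted above.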

This model uses a window size of 30ms and a step of 10ms.

```python
frontend_settings.window_size_ms = 30
frontend_settings.window_step_ms = 10
```

These values were found experimentally using the [Audio Visualizer Utility](https://siliconlabs.github.io/mltk/docs/audio/audio_feature_generator.html#audio-visualizer-utility).

104 frequency bins are used to generate the spectrogram:

```python
frontend_settings.filterbank_n_channels = 104
```

Increasing this value improves the resolution of the spectrogram at the cost of model computational complexity (i.e. inference latency).
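
To see how these settings determine the model's input size, the sketch below estimates the spectrogram shape. It assumes one spectrogram row per full window that fits in the audio buffer; the function `spectrogram_shape` is a hypothetical helper written for this example, not the MLTK API.

```python
# Hypothetical helper: estimate the <time, features> shape of the spectrogram,
# assuming one row is produced for every full window that fits in the buffer.
def spectrogram_shape(sample_length_ms, window_size_ms, window_step_ms, n_channels):
    n_frames = (sample_length_ms - window_size_ms) // window_step_ms + 1
    return (n_frames, n_channels)


shape = spectrogram_shape(
    sample_length_ms=1000,
    window_size_ms=30,
    window_step_ms=10,
    n_channels=104,
)
n_input_values = shape[0] * shape[1]

print(shape)           # (98, 104)
print(n_input_values)  # 10192
```

The result agrees with the `1x98x1x104` input shape reported by the profiler later in this tutorial. Doubling `filterbank_n_channels` or halving `window_step_ms` would roughly double the number of input values the model has to process.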

The noise reduction block is enabled but uses a fairly large `min_signal_remaining`:
```python
frontend_settings.noise_reduction_enable = True
frontend_settings.noise_reduction_smoothing_bits = 10
frontend_settings.noise_reduction_even_smoothing =  0.025
frontend_settings.noise_reduction_odd_smoothing = 0.06
frontend_settings.noise_reduction_min_signal_remaining = 0.40
```

This helps to reduce background noise in the field.  
__NOTE:__ We also add padding to the audio samples during training to "warm up" the noise reduction block when generating the spectrogram using the 
[Audio Feature Generator](https://siliconlabs.github.io/mltk/docs/audio/audio_feature_generator.html). See the `audio_pipeline_with_augmentations()`
function in [keyword_spotting_pacman_v3.py](https://siliconlabs.github.io/mltk/docs/python_api/models/siliconlabs/keyword_spotting_pacman_v3.html#model-specification) for more details.



The DC notch filter was enabled to help remove the DC component from the development board's microphone:
```python
frontend_settings.dc_notch_filter_enable = True # Enable the DC notch filter
frontend_settings.dc_notch_filter_coefficient = 0.95
```
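
For intuition, the following is a minimal sketch of a textbook first-order DC blocking filter using the same `0.95` coefficient. It is a floating-point illustration of the general technique and is not taken from the Audio Feature Generator source, which may differ in its details.

```python
# Textbook first-order DC blocker (illustrative, not the MLTK implementation):
#   y[n] = x[n] - x[n-1] + coefficient * y[n-1]
def dc_block(samples, coefficient=0.95):
    filtered = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + coefficient * prev_y
        filtered.append(y)
        prev_x, prev_y = x, y
    return filtered


# A constant (pure DC) input decays towards zero after the initial step
filtered = dc_block([1.0] * 200)
print(filtered[0])              # 1.0
print(abs(filtered[-1]) < 1e-3) # True
```

A coefficient closer to 1.0 makes the notch narrower, so less low-frequency content is removed, but the filter takes longer to settle.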

Dynamic quantization was enabled to convert the generated spectrogram from uint16 to int8:

```python
frontend_settings.quantize_dynamic_scale_enable = True # Enable dynamic quantization
frontend_settings.quantize_dynamic_scale_range_db = 40.0
```
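
The idea behind dynamic scaling can be sketched as follows. This is a simplified floating-point illustration that assumes the spectrogram values are already expressed in dB, that the largest value maps to the top of the int8 range, and that anything more than `range_db` below it is clamped to the bottom. The actual MLTK implementation works on the uint16 spectrogram and may differ.

```python
# Simplified illustration of dynamic scaling (an assumption about the general
# approach, not the MLTK implementation).
def dynamic_quantize_db(values_db, range_db=40.0):
    max_db = max(values_db)
    min_db = max_db - range_db
    quantized = []
    for value in values_db:
        # Clamp to the window [max - range_db, max]
        clamped = min(max(value, min_db), max_db)
        # Map the window linearly onto the int8 range [-128, 127]
        scaled = (clamped - min_db) / range_db
        quantized.append(int(round(scaled * 255.0)) - 128)
    return quantized


print(dynamic_quantize_db([0.0, -10.0, -40.0, -60.0]))  # [127, 63, -128, -128]
```

Because the window follows the loudest value in each spectrogram, quiet and loud recordings of the same keyword map to similar int8 values.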

### Model Architecture


The model is based on the [Temporal efficient neural network (TENet)](https://arxiv.org/pdf/2010.09960.pdf) model architecture.  
> A network for processing spectrogram data using temporal and depthwise convolutions. The network treats the [T, F] spectrogram as a timeseries shaped [T, 1, F].

More details at [mltk.models.shared.tenet.TENet](https://siliconlabs.github.io/mltk/docs/python_api/models/common_models.html#tenet)


In [None]:
def my_model_builder(model: MyModel) -> tf.keras.Model:
    """Build the Keras model
    """
    input_shape = model.input_shape
    # NOTE: This model requires the input shape: <time, 1, features>
    #       while the embedded device expects: <time, features, 1>
    #       Since the <time> axis is still row-major, we can swap the <features> with 1 without issue
    time_size, feature_size, _ = input_shape
    input_shape = (time_size, 1, feature_size)

    keras_model = tenet.TENet12(
        input_shape=input_shape,
        classes=model.n_classes,
        channels=40,
        blocks=5,
    )

    keras_model.compile(
        loss='categorical_crossentropy',
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001, epsilon=1e-8),
        metrics= ['accuracy']
    )

    return keras_model

The main parameters to modify are:

```python
channels = 40
blocks = 5
```

`channels` sets the base number of channels in the network.  
`blocks` sets the number of `(StridedIBB -> IBB -> ...)` blocks in the network.

The larger these values are, the more trainable parameters the model has, which should improve its accuracy. 
However, increasing these values also increases the model's computational complexity, which increases the inference latency.
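
The comment in `my_model_builder()` states that swapping `<time, features, 1>` for `<time, 1, features>` is safe. A small sketch can confirm this by checking that every element lands at the same row-major offset in both shapes. The helper `flat_index` is written for this example only.

```python
# Illustrative helper: row-major (C-order) offset of `index` in an array of `shape`
def flat_index(shape, index):
    offset = 0
    for dim, i in zip(shape, index):
        offset = offset * dim + i
    return offset


T, F = 98, 104  # <time, features> from the settings used in this tutorial

# Compare the memory offset of every (t, f) element in both layouts
same_layout = all(
    flat_index((T, F, 1), (t, f, 0)) == flat_index((T, 1, F), (t, 0, f))
    for t in range(T)
    for f in range(F)
)
print(same_layout)  # True
```

Since the offsets are identical, the embedded device can fill the input tensor in its own layout without any data being reordered.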

### Audio Data Generator

This model has an additional requirement that the keywords need to be said while the Pac-Man video game noises are generated in the background. As such, the model is trained by taking each keyword sample and adding a snippet of background noise to the sample. In this way, the model learns to pick out the keywords from the Pac-Man video game's noises.
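
The mixing step can be sketched in plain Python as shown below. This is a stand-in for illustration only: the model specification uses the `audiomentations` library for this, and it adds the Pac-Man noise at an absolute RMS level rather than at a relative SNR. The sine waves are placeholders for a keyword recording and game audio.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal`, scaled so the signal-to-noise ratio is `snr_db`."""
    target_noise_rms = rms(signal) / (10.0 ** (snr_db / 20.0))
    gain = target_noise_rms / rms(noise)
    return [s + gain * n for s, n in zip(signal, noise)]


# Placeholder audio: 1s at 16kHz
keyword = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(16000)]
game_noise = [0.3 * math.sin(2 * math.pi * 97 * i / 16000) for i in range(16000)]

mixed = mix_at_snr(keyword, game_noise, snr_db=10.0)
print(len(mixed))  # 16000
```

A lower `snr_db` makes the background noise louder relative to the keyword, which gives the model harder examples to learn from.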

The Pac-Man game audio was acquired by recording during game play (using the arrows on the keyboard). Recording was done using the MLTK command:

```
mltk classify_audio keyword_spotting_pacman_v3 --dump-audio --device
```

This command uses the microphone on the development board to record the video game's generated audio. The recorded audio is saved to the local PC as a `.wav` file.

The [model specification](https://github.com/SiliconLabs/mltk/tree/master/mltk/models/siliconlabs/keyword_spotting_pacman_v3.py) file was then modified to apply random augmentations to the dataset samples and then [generate spectrograms](https://siliconlabs.github.io/mltk/docs/python_api/data_preprocessing/audio.html#mltk.core.preprocess.utils.audio.apply_frontend) from the augmented samples.
The spectrograms are given to the model for training.

__NOTE:__ The spectrogram generation algorithm [source code](https://siliconlabs.github.io/mltk/docs/audio/audio_feature_generator.html) is shared between the model training script and embedded runtime. This way, the generated spectrograms "look" the same during training and inference which should make the model more robust in the field.

In [None]:
def audio_pipeline_with_augmentations(
    path_batch:np.ndarray,
    label_batch:np.ndarray,
    seed:np.ndarray
) -> np.ndarray:
    """Augment a batch of audio clips and generate spectrograms

    This does the following, for each audio file path in the input batch:
    1. Read audio file
    2. Adjust its length to fit within the specified length
    3. Apply random augmentations to the audio sample using audiomentations
    4. Convert to the specified sample rate (if necessary)
    5. Generate a spectrogram from the augmented audio sample
    6. Dump the augmented audio and spectrogram (if necessary)

    NOTE: This will be executed in parallel across *separate* subprocesses.

    Arguments:
        path_batch: Batch of audio file paths
        label_batch: Batch of corresponding labels
        seed: Batch of seeds to use for random number generation,
            This ensures that the "random" augmentations are reproducible

    Return:
        Generated batch of spectrograms from augmented audio samples
    """
    batch_length = path_batch.shape[0]
    height, width = frontend_settings.spectrogram_shape
    x_shape = (batch_length, height, 1, width)
    x_batch = np.empty(x_shape, dtype=np.int8)

    # This is the amount of padding we add to the beginning of the sample
    # This allows for "warming up" the noise reduction block
    padding_length_ms = 1000
    padded_frontend_settings = frontend_settings.copy()
    padded_frontend_settings.sample_length_ms += padding_length_ms

    # For each audio sample path in the current batch
    for i, (audio_path, labels) in enumerate(zip(path_batch, label_batch)):
        class_id = np.argmax(labels)
        np.random.seed(seed[i])

        rn = np.random.random()
        # 3% of the time we want to replace the "unknown" sample with silence
        if class_id == unknown_class_id and rn < 0.03:
            original_sample_rate = frontend_settings.sample_rate_hz
            sample = np.zeros((original_sample_rate,), dtype=np.float32)
            audio_path = 'silence.wav'.encode('utf-8')
        else:
            # Read the audio file
            try:
                sample, original_sample_rate = audio_utils.read_audio_file(audio_path, return_numpy=True, return_sample_rate=True)
            except Exception as e:
                raise RuntimeError(f'Failed to read: {audio_path}, err: {e}')

        # Create a buffer to hold the padded sample
        padding_length = int((original_sample_rate * padding_length_ms) / 1000)
        padded_sample_length = int((original_sample_rate * padded_frontend_settings.sample_length_ms) / 1000)
        padded_sample = np.zeros((padded_sample_length,), dtype=np.float32)


        # Adjust the audio clip to the length defined in the frontend_settings
        out_length = int((original_sample_rate * frontend_settings.sample_length_ms) / 1000)
        sample = audio_utils.adjust_length(
            sample,
            out_length=out_length,
            trim_threshold_db=30,
            offset=np.random.uniform(0, 1)
        )
        padded_sample[padding_length:padding_length+len(sample)] += sample



        # Initialize the global audio augmentations instance
        # NOTE: We want this to be global so that we only initialize it once per subprocess
        audio_augmentations = globals().get('audio_augmentations', None)
        if audio_augmentations is None:
            audio_augmentations = audiomentations.Compose(
                p=1.0,
                transforms=[
                audiomentations.Gain(min_gain_in_db=0.95, max_gain_in_db=1.2, p=1.0),
                audiomentations.AddBackgroundNoise(
                    f'{dataset_dir}/_background_noise_/ambient',
                    min_snr_in_db=-1, # The lower the SNR, the louder the background noise
                    max_snr_in_db=35,
                    noise_rms="relative",
                    lru_cache_size=50,
                    p=0.80
                ),
                audiomentations.AddBackgroundNoise(
                    f'{dataset_dir}/_background_noise_/pacman',
                    min_absolute_rms_in_db=-60,
                    max_absolute_rms_in_db=-35,
                    noise_rms="absolute",
                    lru_cache_size=50,
                    p=0.50
                ),
                audiomentations.AddBackgroundNoise(
                    f'{dataset_dir}/_background_noise_/brd2601',
                    min_absolute_rms_in_db=-75.0,
                    max_absolute_rms_in_db=-60.0,
                    noise_rms="absolute",
                    lru_cache_size=50,
                    p=1.0
                ),
                #audiomentations.AddGaussianSNR(min_snr_in_db=25, max_snr_in_db=40, p=0.25),
            ])
            globals()['audio_augmentations'] = audio_augmentations

        # Apply random augmentations to the audio sample
        augmented_sample = audio_augmentations(padded_sample, original_sample_rate)

        # Convert the sample rate (if necessary)
        if original_sample_rate != frontend_settings.sample_rate_hz:
            augmented_sample = audio_utils.resample(
                augmented_sample,
                orig_sr=original_sample_rate,
                target_sr=frontend_settings.sample_rate_hz
            )

        # Ensure the sample values are within (-1,1)
        augmented_sample = np.clip(augmented_sample, -1.0, 1.0)

        # Generate a spectrogram from the augmented audio sample
        spectrogram = audio_utils.apply_frontend(
            sample=augmented_sample,
            settings=padded_frontend_settings,
            dtype=np.int8
        )

        # The input audio sample was padded with padding_length_ms of background noise
        # Drop the padded background noise from the final spectrogram used for training
        spectrogram = spectrogram[-height:, :]
        # The output spectrogram is 2D: <time, features>
        # Insert a singleton axis so it matches the model's input: <time, 1, features>
        spectrogram = np.expand_dims(spectrogram, axis=-2)

        x_batch[i] = spectrogram

    return x_batch
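
The padding and trimming steps in the function above can be checked with a small calculation. The sketch assumes one spectrogram row per full window that fits in the buffer; `n_frames` is a hypothetical helper, and the nested list stands in for a real spectrogram.

```python
# Hypothetical helper: number of spectrogram rows for a given sample length,
# assuming one row per full window that fits in the buffer.
def n_frames(sample_length_ms, window_size_ms=30, window_step_ms=10):
    return (sample_length_ms - window_size_ms) // window_step_ms + 1


padding_length_ms = 1000
height = n_frames(1000)                             # rows the model expects
padded_height = n_frames(1000 + padding_length_ms)  # rows after padding

# Stand-in spectrogram: one row per frame, each row holding its frame index
padded_spectrogram = [[row] for row in range(padded_height)]

# Same slicing as `spectrogram[-height:, :]`: keep only the last `height` rows
trimmed = padded_spectrogram[-height:]

print(padded_height, height, trimmed[0][0])  # 198 98 100
```

In this sketch, the first 100 rows (the warm-up period for the noise reduction block) are discarded and only the final 98 rows are used for training.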

### Profiling the model

Before training a machine learning model, it is important to know how efficiently the model will execute on the embedded target. This is especially true when using keyword spotting to control Pac-Man (a keyword that takes > 1s to detect will not be useful when trying to avoid the ghosts).

If the model inference takes too long to execute on the embedded target, then the model parameters need to be decreased to reduce the model's computational complexity. The desired model parameters should be known before the model is fully trained.

To help determine the best model parameters, the MLTK features a [Model Profiler](https://siliconlabs.github.io/mltk/docs/guides/model_profiler.html) command:

In [3]:
!mltk profile keyword_spotting_pacman_v3 --device --build --accelerator MVP

Profiling ML model on device ...

Profiling Summary
Name: my_model
Accelerator: MVP
Input Shape: 1x98x1x104
Input Data Type: int8
Output Shape: 1x7
Output Data Type: int8
Flash, Model File Size (bytes): 446.5k
RAM, Runtime Memory Size (bytes): 76.7k
Operation Count: 12.4M
Multiply-Accumulate Count: 6.0M
Layer Count: 90
Unsupported Layer Count: 0
Accelerator Cycle Count: 5.3M
CPU Cycle Count: 954.1k
CPU Utilization (%): 16.7
Clock Rate (hz): 78.0M
Time (s): 73.3m
Ops/s: 169.2M
MACs/s: 82.1M
Inference/s: 13.7

Model Layers
+-------+-------------------+--------+--------+------------+------------+----------+--------------------------+--------------+------------------------------------------------------+
| Index | OpCode            | # Ops  | # MACs | Acc Cycles | CPU Cycles | Time (s) | Input Shape              | Output Shape | Options                                              |
+-------+-------------------+--------+--------+------------+------------+----------+-------------------------

This command builds the model then profiles it on the development board using the [MVP](https://docs.silabs.com/gecko-platform/latest/machine-learning/tensorflow/mvp-accelerator) hardware accelerator.
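
As a sanity check, the throughput figures in the summary can be approximately reproduced from the reported counts and inference time. The values below are copied from the profiling summary above. The utilization formula is an assumption that happens to be consistent with the reported number, and small differences come from rounding in the displayed values.

```python
# Values copied from the profiling summary
inference_time_s = 73.3e-3
op_count = 12.4e6
cpu_cycles = 954.1e3
clock_rate_hz = 78.0e6

inferences_per_s = 1.0 / inference_time_s
ops_per_s = op_count / inference_time_s

# Assumption: utilization = CPU cycles / total clock cycles during one inference
cpu_utilization = 100.0 * cpu_cycles / (clock_rate_hz * inference_time_s)

print(f"{inferences_per_s:.1f} inferences/s")  # 13.6 inferences/s
print(f"{ops_per_s / 1e6:.1f}M ops/s")         # 169.2M ops/s
print(f"{cpu_utilization:.1f}% CPU")           # 16.7% CPU
```

At roughly 73ms per inference, the model runs more than ten times per second, which is well within the sub-second response time needed to steer Pac-Man.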

### Training the model

Once the [model specification](https://github.com/SiliconLabs/mltk/tree/master/mltk/models/siliconlabs/keyword_spotting_pacman_v3.py) is ready, it can be [trained](https://siliconlabs.github.io/mltk/docs/guides/model_training.html) with the command:

In [None]:
!mltk train keyword_spotting_pacman_v3

### Train in cloud

Alternatively, you can _vastly_ improve the model training time by training this model in the "cloud".  
See the tutorial: [Cloud Training with vast.ai](https://siliconlabs.github.io/mltk/mltk/tutorials/cloud_training_with_vast_ai.html) for more details.

After training completes, a [model archive](https://github.com/SiliconLabs/mltk/tree/master/mltk/models/siliconlabs/keyword_spotting_pacman_v3.mltk.zip) file is generated containing the quantized `.tflite` model file. This is the file that is built into the firmware application.

## Creating the Firmware Application

The [BLE Audio Classifier](https://siliconlabs.github.io/mltk/docs/cpp_development/examples/ble_audio_classifier.html) C++ example application may be used with the trained model.

The application uses the [Audio Feature Generator](https://siliconlabs.github.io/mltk/docs/audio/audio_feature_generator.html#gecko-sdk-component) library to generate spectrograms from the streaming microphone audio.
The spectrograms are then passed to the [Tensorflow-Lite Micro](https://github.com/tensorflow/tflite-micro) inference engine, which uses the trained model from above to predict whether a keyword is present in the spectrogram.
If a keyword is detected, a connected BLE client is sent a notification containing the detected class ID of the keyword and prediction probability.
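
On the client side, handling such a notification might look like the sketch below. The payload format (an ASCII string `<class id>,<confidence>` with the confidence as a value from 0 to 255) and the class ordering are assumptions made for this example; refer to the BLE Audio Classifier documentation for the actual format. The demo's webpage does this in JavaScript, and Python is used here only for consistency with the rest of this tutorial.

```python
# Assumption: class IDs follow the order of the classes listed earlier
CLASS_NAMES = ["left", "right", "up", "down", "stop", "go", "_unknown_"]


def parse_notification(payload: bytes):
    """Decode a hypothetical '<class id>,<confidence 0-255>' notification."""
    class_id_text, confidence_text = payload.decode("utf-8").strip().split(",")
    class_id = int(class_id_text)
    confidence = int(confidence_text) / 255.0
    return CLASS_NAMES[class_id], confidence


keyword, confidence = parse_notification(b"2,204")
print(keyword, confidence)  # up 0.8
```

A client could ignore detections below a chosen confidence threshold before moving Pac-Man.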

## Creating the Pac-Man Webpage

A [Pac-Man Webpage](https://github.com/SiliconLabs/mltk/blob/master/cpp/shared/apps/ble_audio_classifier/web/pacman) is available that allows for playing the game "Pac-Man" using
the keywords detected by the firmware application described above.

This webpage was adapted from a game created by Lucio Panepinto; the original source code is available on [GitHub](https://github.com/luciopanepinto/pacman).

The webpage was modified to use the [p5.ble.js](https://itpnyu.github.io/p5ble-website/) library for communicating with the firmware application via Bluetooth Low Energy.


## Running the Demo

With the following components complete:  
- Keyword spotting machine learning model
- Firmware application with Audio Feature Generator, Tensorflow-Lite Micro, and Bluetooth libraries
- Pac-Man webpage with Bluetooth

We can now run the demo.

A live demo may be found at: [https://mltk-pacman.web.app](https://mltk-pacman.web.app).

Alternatively, you can build the firmware application from source and run the webpage locally:

### Build firmware application from source

The MLTK supports building [C++ Applications](https://siliconlabs.github.io/mltk/docs/cpp_development/index.html).

It also features a [ble_audio_classifier](https://siliconlabs.github.io/mltk/docs/cpp_development/examples/ble_audio_classifier.html) C++ application
which can be built using:  
- [Visual Studio Code](https://siliconlabs.github.io/mltk/docs/cpp_development/vscode.html) 
- [Simplicity Studio](https://siliconlabs.github.io/mltk/docs/cpp_development/simplicity_studio.html)
- [Command Line](https://siliconlabs.github.io/mltk/docs/cpp_development/command_line.html)

Refer to the [ble_audio_classifier](https://siliconlabs.github.io/mltk/docs/cpp_development/examples/ble_audio_classifier.html) application's documentation
for how to include your model in the built application.

### Run webpage locally

The demo's webpage uses "vanilla" JavaScript, CSS, and HTML. No special build system is required.

To run the webpage locally, simply open [index.html](https://github.com/SiliconLabs/mltk/blob/master/cpp/shared/apps/ble_audio_classifier/web/pacman/index.html) in your web browser (NOTE: double-click the __locally cloned__ `index.html` on your PC, not the one on GitHub).

When the webpage starts, follow the instructions but do _not_ program the `.s37`. The locally built firmware application should have already been programmed as described in the previous section.