## Training Your Custom Dataset Keyword Spotting Model 
## Using 4-6-8-CustomDatasetKWSModel-rev4-part2.ipynb

*Important: Please Read This!*

This notebook is the second of the custom dataset KWS Model notebooks.  Part 1 was run in Colab; this second part runs locally on your PC.  The assumption is that you are using Fedora and have Anaconda installed.  For more information, check out the project repository <a href="https://github.com/john-mangiaracina/TinyML-CustomKeywordSpotting/tree/main">here</a>.  Note:  other distros in the Red Hat family (RHEL, CentOS, etc.) should also work, but only Fedora 37 and Fedora 38 have been tested.

### Import packages
Download and unpack the TensorFlow GitHub repository source (v2.4.1), which contains the relevant code required to run this assignment.


In [None]:
!wget https://github.com/tensorflow/tensorflow/archive/v2.4.1.zip
!unzip v2.4.1.zip > /dev/null 2>&1
!mv tensorflow-2.4.1/ tensorflow/

In [None]:
#  Let's check our version of Python used within conda
#  This should say 3.7.16.  If it doesn't, you are not in the right environment
# and need to go back to the readme in the repo.

import sys

print("Python version")
print(sys.version)
#print(tf.__version__)

In [None]:
import tensorflow as tf
print(tf.__version__)

####   This should show an error.  TensorFlow should not be pre-installed.  If it is, return to the readme in the repo.

####   Let's now install TensorFlow 1.15.0.  This will take a few minutes.

In [None]:
!conda install tensorflow=1.15 -y

##  Important!

#### Restart the kernel, either from the dropdown menu or from the half circle with an arrow

In [None]:
#  Now let's check the TensorFlow version.  It should show 1.15.0

import tensorflow as tf
print(tf.__version__)
#print(tf.version.VERSION)

In [None]:
#  check whether you are using a 32 or 64 bit system.  TF needs 64 bits

import struct
print(struct.calcsize("P") * 8)

####  Now let's install the google colab package.  Use pip

In [None]:
#!conda install -c conda-forge google-colab 
!pip install google-colab 

In [None]:
#  Confirm the installed TensorFlow package with pip.  The version should be 1.15.0

import tensorflow as tf
!pip show tensorflow

In [None]:
#  verify google colab packages installed

!pip show google-colab 

##  Important!   Before the next code is run, make sure you have unzipped your custom dataset download.  Within the unzipped folder, there will be a folder named 'dataset'.  Move this folder into the same directory as this Jupyter notebook.


In [None]:
#  Now we have to edit some files within the Python 3.7 lib of your virtual dev env.
#  This assumes that you have followed the default suggestions and installed Anaconda in the default dir.
#  The asterisk is escaped so that sed matches it literally instead of treating it as a repeat operator.

!sed -i 's/from traitlets import \*/import traitlets /g' ~/anaconda3/envs/myenv37/lib/python3.7/site-packages/IPython/utils/traitlets.py 
!sed -i 's/from IPython.utils import traitlets as _traitlets/import traitlets as _traitlets/g' ~/anaconda3/envs/myenv37/lib/python3.7/site-packages/google/colab/data_table.py

In [None]:
# Note: The install assumes that the files downloaded for part 1 and this notebook are in the same directory.

import sys

# We add this path so we can import the speech processing modules.
sys.path.append("tensorflow/tensorflow/examples/speech_commands/")
import input_data
import models
import numpy as np
import glob
import os
import re
import shutil
from google.colab import files

In [None]:
!pip install ffmpeg-python > /dev/null 2>&1

### Configure Your Model!
OK, now that your custom dataset is all ready to go, you'll need to select the keywords and model settings to train with!

```WANTED_WORDS``` = A comma-delimited string of the words you want to train for (e.g., "yes,no"). 

**Make sure to input the keywords you collected!  Also, make sure to have no spaces within the string.  The string must only contain letters and commas.**

In [None]:
WANTED_WORDS = "pizza,coke,bread"   #  do not have something like "pizza,  coke,   bread"  it won't work!
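
A quick sanity check can catch a malformed keyword string before you spend hours training. The helper below is only a sketch: `validate_wanted_words` is a made-up name, not part of the TensorFlow scripts, and it assumes your keywords are plain ASCII letters, as the note above requires.

```python
import re

def validate_wanted_words(wanted_words):
    """Return the keyword list, or raise ValueError if the string is malformed."""
    # Letters separated by single commas: no spaces and no empty entries.
    if not re.fullmatch(r"[A-Za-z]+(,[A-Za-z]+)*", wanted_words):
        raise ValueError("Use only letters and commas, e.g. 'pizza,coke,bread'")
    return wanted_words.split(",")

print(validate_wanted_words("pizza,coke,bread"))
```

Passing something like `"pizza,  coke"` raises a `ValueError` instead of failing later inside the training script.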

The number of training steps and learning rates can be specified as comma-separated strings to define the amount/rate at each stage. For example, ```TRAINING_STEPS="12000,3000"``` and ```LEARNING_RATE="0.001,0.0001"``` will run 12,000 training steps with a rate of 0.001 followed by 3,000 final steps with a learning rate of 0.0001. These are good default values to work off of when you choose your values as the course staff has gotten this to work well with those values in the past!

In [None]:
TRAINING_STEPS = "30000,4000"   #  This will take a while, set it up before dinner and watching some ball
LEARNING_RATE = "0.001,0.0001"
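
To make the staged schedule concrete, here is a small sketch that pairs each stage's step count with its learning rate. `stage_schedule` is a hypothetical helper for illustration only; the real parsing happens inside `train.py`, which, as far as I know, also expects one learning rate per stage.

```python
def stage_schedule(training_steps, learning_rate):
    """Pair each stage's step count with its learning rate."""
    steps = [int(s) for s in training_steps.split(",")]
    rates = [float(r) for r in learning_rate.split(",")]
    if len(steps) != len(rates):
        raise ValueError("Need one learning rate per training stage")
    return list(zip(steps, rates))

schedule = stage_schedule("30000,4000", "0.001,0.0001")
for steps, rate in schedule:
    print("%d steps at learning rate %g" % (steps, rate))
print("Total steps:", sum(steps for steps, _ in schedule))
```

With the values in this notebook, that is 30,000 steps followed by 4,000 steps, or 34,000 in total.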

We suggest you leave the ```MODEL_ARCHITECTURE``` as tiny_conv the first time, but if you would like to do this again and explore additional models, some options are: ```single_fc, conv, low_latency_conv, low_latency_svdf, tiny_embedding_conv```. **Do remember if you switch the model type you may need to update the C++ code to include the ```tflite::AllOpsResolver``` to make sure you have all of the necessary ops!**

In [None]:
MODEL_ARCHITECTURE = 'tiny_conv'

In [None]:
# Calculate the total number of steps, which is used to identify the checkpoint
# file name.
TOTAL_STEPS = str(sum(map(int, TRAINING_STEPS.split(","))))

# Print the configuration to confirm it
print("Training these words: %s" % WANTED_WORDS)
print("Training steps in each stage: %s" % TRAINING_STEPS)
print("Learning rate in each stage: %s" % LEARNING_RATE)
print("Total number of training steps: %s" % TOTAL_STEPS)

**We suggest that you do not modify** the following constants as they include filepaths used in this notebook and data that is shared during training and inference.

In [None]:
# Calculate the percentage of 'silence' and 'unknown' training samples required
# to ensure that we have equal number of samples for each label.
number_of_labels = WANTED_WORDS.count(',') + 1
number_of_total_labels = number_of_labels + 2 # for 'silence' and 'unknown' label
equal_percentage_of_training_samples = int(100.0/(number_of_total_labels))
SILENT_PERCENTAGE = equal_percentage_of_training_samples
UNKNOWN_PERCENTAGE = equal_percentage_of_training_samples

# Constants used during training only
VERBOSITY = 'DEBUG'
EVAL_STEP_INTERVAL = '1000'
SAVE_STEP_INTERVAL = '1000'

# Constants for training directories and filepaths
LOGS_DIR = 'logs/'
TRAIN_DIR = 'train/' # for training checkpoints and other files.

# Constants for inference directories and filepaths
import os
MODELS_DIR = 'models'
if not os.path.exists(MODELS_DIR):
  os.mkdir(MODELS_DIR)
MODEL_TF = os.path.join(MODELS_DIR, 'KWS_custom.pb')
MODEL_TFLITE = os.path.join(MODELS_DIR, 'KWS_custom.tflite')
FLOAT_MODEL_TFLITE = os.path.join(MODELS_DIR, 'KWS_custom_float.tflite')
MODEL_TFLITE_MICRO = os.path.join(MODELS_DIR, 'KWS_custom.cc')
SAVED_MODEL = os.path.join(MODELS_DIR, 'KWS_custom_saved_model')
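
Here is the percentage arithmetic from the cell above as a worked example. It is a sketch that repeats the same formula in a hypothetical helper, `equal_label_percentage`, so you can see what different keyword counts produce. Note that the training cell further down passes 20.0 directly, which matches three keywords; with a different number of keywords you may want to adjust those flags.

```python
def equal_label_percentage(wanted_words):
    """Percentage of samples per label once 'silence' and 'unknown' are added."""
    number_of_labels = wanted_words.count(",") + 1
    number_of_total_labels = number_of_labels + 2  # 'silence' and 'unknown'
    return int(100.0 / number_of_total_labels)

for words in ("yes,no", "pizza,coke,bread"):
    print(words, "->", equal_label_percentage(words), "percent per label")
```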

**Be careful if you modify** the following constants, as they will have downstream effects on the C++ code, which you will then have to change. These mainly relate to hyperparameters for quantization and preprocessing. The first time you train a custom model, **we suggest you do not modify these either.**

In [None]:
# Constants which are shared during training and inference
PREPROCESS = 'micro'
WINDOW_STRIDE = 20

# Constants for Quantization
QUANT_INPUT_MIN = 0.0
QUANT_INPUT_MAX = 26.0
QUANT_INPUT_RANGE = QUANT_INPUT_MAX - QUANT_INPUT_MIN

# Constants for audio process during Quantization and Evaluation
SAMPLE_RATE = 16000
CLIP_DURATION_MS = 1000
WINDOW_SIZE_MS = 30.0
FEATURE_BIN_COUNT = 40
BACKGROUND_FREQUENCY = 0.8
BACKGROUND_VOLUME_RANGE = 0.1
TIME_SHIFT_MS = 100.0

# Use the custom local dataset and set the test/val/train split
DATA_URL = ''
VALIDATION_PERCENTAGE = 10
TESTING_PERCENTAGE = 10
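
These audio constants are where the 1960 in the later ```reshape(1, 1960)``` call comes from. The sketch below reproduces that arithmetic as I understand it from `models.prepare_model_settings` with the 'micro' preprocessor; `fingerprint_size` is a hypothetical name, so treat it as an illustration rather than the library's code.

```python
def fingerprint_size(sample_rate, clip_duration_ms, window_size_ms,
                     window_stride_ms, feature_bin_count):
    """Number of input features for one clip: spectrogram frames x feature bins."""
    desired_samples = int(sample_rate * clip_duration_ms / 1000)
    window_size_samples = int(sample_rate * window_size_ms / 1000)
    window_stride_samples = int(sample_rate * window_stride_ms / 1000)
    length_minus_window = desired_samples - window_size_samples
    if length_minus_window < 0:
        frames = 0
    else:
        frames = 1 + int(length_minus_window / window_stride_samples)
    return frames * feature_bin_count

# 16 kHz, 1000 ms clips, 30 ms windows, 20 ms stride, 40 feature bins
print(fingerprint_size(16000, 1000, 30.0, 20, 40))
```

With the defaults, that is 49 frames of 40 bins each. This is why changing the window size, stride, or bin count means the reshape and the C++ code must change too.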

In [None]:
#  Let's print out some things as part of debugging
#  I have left this because others may find it helpful

print(SILENT_PERCENTAGE)
print(UNKNOWN_PERCENTAGE)
print(WINDOW_STRIDE)
print(EVAL_STEP_INTERVAL)
print(SAVE_STEP_INTERVAL)

In [None]:
#  part of debugging process to try to figure out what the blazes is going on

SILENT_PERCENTAGE = float(SILENT_PERCENTAGE)
UNKNOWN_PERCENTAGE = float(UNKNOWN_PERCENTAGE)
WINDOW_STRIDE = float(WINDOW_STRIDE)

print(SILENT_PERCENTAGE)
print(UNKNOWN_PERCENTAGE)
print(WINDOW_STRIDE)

# Train the model

### Load in TensorBoard to visualize the training process.

As training progresses, you should see the training status show up in the TensorBoard area. When this works, it is very helpful for analyzing your training progress. Unfortunately, the staff has found that it sometimes doesn't start showing data for a while (~15 minutes) and sometimes doesn't show data until training completes (and instead shows ```No dashboards are active for the current data set```). If it is working and then stops updating, look to the top of the cell and click reconnect.

--->>> Tensorboard works great until it doesn't

In [None]:
DATASET_DIR='dataset'

In [None]:
%load_ext tensorboard
%tensorboard --logdir {LOGS_DIR}

### Launch Training

--->>>  Note:  We will *not* be using a GPU.  Specific nVidia solutions are required for Anaconda.  I burned up a lot of cycles figuring out my GPU is not supported.  Just fuhgeddaboudit!

The original writeup stated 10 hours on Colab with no GPU, 2 hours with GPU.  I believe them, but that is with the Colab Dev Env.

When I ran this on a ThinkPad, the training portion took around 1 hour and 5 minutes to run 15,000 training cycles.  Since it is on your own system, you don't need to worry about being timed out.  Do something else while it runs. 

Most likely, your computer will complete this task much faster than mine.

If you follow my settings, you should be OK.  Of course, feel free to experiment! That's where things get really fun.

If you would like to get more information on the training script you can find the source code for the script [here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/speech_commands/train.py). In short it sets up the optimizer and preprocessor based on all of the flags we pass in!

*Note:  the above paragraph should really be required.  It is worth your time to check out the default values.*

Finally, be aware that with the verbosity set to ```DEBUG```, the training cell will print A LOT of information. Specifically, you will get the accuracy and loss at each step as well as a confusion matrix every 1000 steps. We hope that is helpful in case TensorBoard fails to work. If you would like fewer printouts, you can change the setting to ```WARN``` or ```FATAL```. The ```VERBOSITY``` constant lives in the "Configure Your Model!" section, but note that the training cell below passes ```--verbosity='DEBUG'``` directly, so change it there as well.

*Note:  Keeping the setting at DEBUG jacks up the  uP cycles devoted to i/o but is extremely helpful.  I suggest keeping it.*

In [None]:
!python tensorflow/tensorflow/examples/speech_commands/train.py \
  --data_dir={DATASET_DIR} \
  --data_url='' \
  --wanted_words={WANTED_WORDS} \
  --silence_percentage=20.0 \
  --unknown_percentage=20.0 \
  --preprocess='micro' \
  --window_stride=20.0 \
  --model_architecture={MODEL_ARCHITECTURE} \
  --how_many_training_steps={TRAINING_STEPS} \
  --learning_rate={LEARNING_RATE} \
  --train_dir={TRAIN_DIR} \
  --summaries_dir={LOGS_DIR} \
  --verbosity='DEBUG' \
  --eval_step_interval=1000 \
  --save_step_interval=1000

# Generating your Model
Just like with the pre-trained model, we will now take the final checkpoint and convert it into a quantized TensorFlow Lite model.

### Generate a TensorFlow Model for Inference

Combine relevant training results (graph, weights, etc) into a single file for inference. This process is known as freezing a model and the resulting model is known as a frozen model/graph, as it cannot be further re-trained after this process.

In [None]:
!rm -rf {SAVED_MODEL}
!python tensorflow/tensorflow/examples/speech_commands/freeze.py \
--wanted_words=$WANTED_WORDS \
--window_stride_ms=$WINDOW_STRIDE \
--preprocess=$PREPROCESS \
--model_architecture=$MODEL_ARCHITECTURE \
--start_checkpoint=$TRAIN_DIR$MODEL_ARCHITECTURE'.ckpt-'{TOTAL_STEPS} \
--save_format=saved_model \
--output_file={SAVED_MODEL}
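
If the freeze cell complains that it cannot find the checkpoint, check the path it was given. The sketch below rebuilds the checkpoint prefix the same way the cell does and looks for the matching file. `checkpoint_prefix` is a hypothetical helper, and the `.index` suffix is an assumption based on how TF 1.x normally saves checkpoints.

```python
import os

def checkpoint_prefix(train_dir, model_architecture, training_steps):
    """Build the checkpoint prefix that the freeze cell passes to freeze.py."""
    total_steps = sum(int(s) for s in training_steps.split(","))
    return "%s%s.ckpt-%d" % (train_dir, model_architecture, total_steps)

prefix = checkpoint_prefix("train/", "tiny_conv", "30000,4000")
print(prefix)

# Assumption: TF 1.x writes an '.index' file next to each checkpoint.
if not os.path.exists(prefix + ".index"):
    print("Checkpoint not found; did training run to the final step?")
```

If training stopped early, the last saved checkpoint will carry a smaller step number, since checkpoints are written every 1000 steps.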

### Generate a TensorFlow Lite Model

Convert the frozen graph into a TensorFlow Lite model, which is fully quantized for use with embedded devices. The following cell will also print the model size.

In [None]:
model_settings = models.prepare_model_settings(
    len(input_data.prepare_words_list(WANTED_WORDS.split(','))),
    SAMPLE_RATE, CLIP_DURATION_MS, WINDOW_SIZE_MS,
    WINDOW_STRIDE, FEATURE_BIN_COUNT, PREPROCESS)
audio_processor = input_data.AudioProcessor(
    DATA_URL, DATASET_DIR,
    SILENT_PERCENTAGE, UNKNOWN_PERCENTAGE,
    WANTED_WORDS.split(','), VALIDATION_PERCENTAGE,
    TESTING_PERCENTAGE, model_settings, LOGS_DIR)

**Note: if the below cell fails it might be because you do not have enough data to have 100 recordings in the representative dataset!** If this happens you will see an error that says something like ```ValueError: cannot reshape array of size 0 into shape (1,1960)```. To help you fix this we have added a ```print(i)``` into the loop. As such, all you have to do is change the ```REP_DATA_SIZE``` variable to be equal to the last integer value printed out by the loop and then re-run the cell!

In [None]:
#  used for debugging

print(SILENT_PERCENTAGE)
print(PREPROCESS)
print(WINDOW_STRIDE)
print(VERBOSITY)
print(EVAL_STEP_INTERVAL)
print(SAVE_STEP_INTERVAL)
print(DATA_URL)
print(BACKGROUND_FREQUENCY)
print(BACKGROUND_VOLUME_RANGE)
print(TIME_SHIFT_MS)

In [None]:
REP_DATA_SIZE = 23  #   <---  Don't be surprised if you have to change this value


with tf.Session() as sess:
  # with tf.compat.v1.Session() as sess: #replaces the above line for use with TF2.x
  float_converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL)
  float_tflite_model = float_converter.convert()
  float_tflite_model_size = open(FLOAT_MODEL_TFLITE, "wb").write(float_tflite_model)
  print("Float model is %d bytes" % float_tflite_model_size)

  converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL)
  converter.optimizations = [tf.lite.Optimize.DEFAULT]
  converter.inference_input_type = tf.lite.constants.INT8
  # converter.inference_input_type = tf.compat.v1.lite.constants.INT8 #replaces the above line for use with TF2.x   
  converter.inference_output_type = tf.lite.constants.INT8
  # converter.inference_output_type = tf.compat.v1.lite.constants.INT8 #replaces the above line for use with TF2.x
  def representative_dataset_gen():
    for i in range(REP_DATA_SIZE):
      data, _ = audio_processor.get_data(1, i*1, model_settings,
                                         BACKGROUND_FREQUENCY, 
                                         BACKGROUND_VOLUME_RANGE,
                                         TIME_SHIFT_MS,
                                         'testing',
                                         sess)
      flattened_data = np.array(data.flatten(), dtype=np.float32).reshape(1, 1960)
      print(i)
      yield [flattened_data]
  converter.representative_dataset = representative_dataset_gen
  tflite_model = converter.convert()
  tflite_model_size = open(MODEL_TFLITE, "wb").write(tflite_model)
  print("Quantized model is %d bytes" % tflite_model_size)
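
As far as I can tell, the reshape error described above appears when the loop index runs past the last available testing sample, so `get_data` hands back an empty array. Rather than finding the limit by trial and error, you can cap the size up front. This is only a sketch: `pick_rep_data_size` is a hypothetical helper, and reading the count from `audio_processor.set_size('testing')` is an assumption about `input_data.py`.

```python
def pick_rep_data_size(testing_sample_count, wanted=100):
    """Cap the representative dataset size at the number of testing samples."""
    if testing_sample_count <= 0:
        raise ValueError("No testing samples available")
    return min(wanted, testing_sample_count)

# In the notebook the count would come from the audio processor, e.g.
# audio_processor.set_size('testing') (assumed, see the note above).
print(pick_rep_data_size(23))
print(pick_rep_data_size(250))
```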



### Testing the accuracy after Quantization

Verify that the model we've exported is still accurate, using the TF Lite Python API and our test set.

In [None]:
# Helper function to run inference
def run_tflite_inference_testSet(tflite_model_path, model_type="Float"):
  #
  # Load test data
  #
  np.random.seed(0) # set random seed for reproducible test results.
  with tf.Session() as sess:
    # with tf.compat.v1.Session() as sess: #replaces the above line for use with TF2.x
    test_data, test_labels = audio_processor.get_data(
        -1, 0, model_settings, BACKGROUND_FREQUENCY, BACKGROUND_VOLUME_RANGE,
        TIME_SHIFT_MS, 'testing', sess)
  test_data = np.expand_dims(test_data, axis=1).astype(np.float32)

  #
  # Initialize the interpreter
  #
  interpreter = tf.lite.Interpreter(tflite_model_path)
  interpreter.allocate_tensors()
  input_details = interpreter.get_input_details()[0]
  output_details = interpreter.get_output_details()[0]
  
  #
  # For quantized models, manually quantize the input data from float to integer
  #
  if model_type == "Quantized":
    input_scale, input_zero_point = input_details["quantization"]
    test_data = test_data / input_scale + input_zero_point
    test_data = test_data.astype(input_details["dtype"])

  #
  # Evaluate the predictions
  #
  correct_predictions = 0
  for i in range(len(test_data)):
    interpreter.set_tensor(input_details["index"], test_data[i])
    interpreter.invoke()
    output = interpreter.get_tensor(output_details["index"])[0]
    top_prediction = output.argmax()
    correct_predictions += (top_prediction == test_labels[i])

  print('%s model accuracy is %f%% (Number of test samples=%d)' % (
      model_type, (correct_predictions * 100) / len(test_data), len(test_data)))

In [None]:
# Compute float model accuracy
run_tflite_inference_testSet(FLOAT_MODEL_TFLITE)

# Compute quantized model accuracy
run_tflite_inference_testSet(MODEL_TFLITE, model_type='Quantized')
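
The manual quantization step in the helper maps each float to an integer with `x / scale + zero_point`. The sketch below shows that mapping on a few values. The `scale` and `zero_point` here are assumptions chosen to match the 0.0 to 26.0 input range set earlier; the real values come from `input_details["quantization"]`. It also rounds and clips to the int8 range, which the helper above does not do.

```python
def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 codes: q = round(x / scale) + zero_point, then clip."""
    quantized = []
    for x in values:
        q = int(round(x / scale)) + zero_point
        quantized.append(max(qmin, min(qmax, q)))
    return quantized

scale = 26.0 / 255.0   # assumed: 0.0..26.0 spread over 256 int8 codes
zero_point = -128      # assumed: 0.0 maps to the lowest int8 code
print(quantize([0.0, 5.2, 26.0, 30.0], scale, zero_point))
```

Values above the input range, like 30.0 here, saturate at 127 instead of wrapping around.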

### Generate a TensorFlow Lite for Microcontrollers Model
To convert the TensorFlow Lite quantized model into a C source file that can be loaded by TensorFlow Lite for Microcontrollers on Arduino, we simply need to use the ```xxd``` tool to convert the ```.tflite``` file into a ```.cc``` file (just like we did in the previous section).

In [None]:
!xxd -i {MODEL_TFLITE} > {MODEL_TFLITE_MICRO}
REPLACE_TEXT = MODEL_TFLITE.replace('/', '_').replace('.', '_')
!sed -i 's/'{REPLACE_TEXT}'/g_model/g' {MODEL_TFLITE_MICRO}
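
If you are curious what ```xxd -i``` plus the ```sed``` rename produce, here is a rough Python equivalent. It is a sketch, not a byte-for-byte copy of the xxd output format, and `to_c_array` is a hypothetical helper name. The sample bytes are arbitrary and stand in for the contents of the ```.tflite``` file.

```python
def to_c_array(data, name="g_model"):
    """Render bytes as a C array plus a length variable, similar to `xxd -i`."""
    lines = ["unsigned char %s[] = {" % name]
    for start in range(0, len(data), 12):  # 12 bytes per row
        chunk = data[start:start + 12]
        lines.append("  " + ", ".join("0x%02x" % b for b in chunk) + ",")
    lines.append("};")
    lines.append("unsigned int %s_len = %d;" % (name, len(data)))
    return "\n".join(lines)

print(to_c_array(b"TFL3"))
```

The result is an array named `g_model` and a matching `g_model_len`, which are the names the ```sed``` command above substitutes in.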

That's it! You've successfully converted your TensorFlow Lite model into a TensorFlow Lite for Microcontrollers model! Run the cell below to print out its contents, which we'll need for our next step: deploying the model using the Arduino IDE!

In [None]:
!cat {MODEL_TFLITE_MICRO}

To download your model for use at a later date:

1. On the left of the UI click on the folder icon
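2. Click on the three dots to the right of the ```.cc``` file you just generated and select "download." The file's path is stored in ```MODEL_TFLITE_MICRO```, which by default is ```models/KWS_custom.cc```. Since this notebook runs locally, the file is also already on your PC, in the ```models``` folder next to the notebook.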

Next *you will* deploy that model using the Arduino IDE.