# Training Your Custom Dataset Keyword Spotting Model

This notebook is based on HarvardX's [4-6-8-CustomDatasetKWSModel.ipynb](https://colab.research.google.com/github/tinyMLx/colabs/blob/master/4-6-8-CustomDatasetKWSModel.ipynb). This time, we train a custom keyword spotting model on your own custom dataset!

In [39]:
%%bash
rm -rf tensorflow log  v2.4.1.zip logs models train dataset extract_loudest_section
apt-get update -qq && apt-get install -y --no-install-recommends wget unzip
wget https://github.com/tensorflow/tensorflow/archive/v2.4.1.zip
unzip v2.4.1.zip &> log
mv tensorflow-2.4.1/ tensorflow/
rm -rf v2.4.1.zip log
python3 -m pip install --upgrade --quiet --no-cache-dir pip ffmpeg-python

Reading package lists...
Building dependency tree...
Reading state information...
zip is already the newest version (3.0-11build1).
unzip is already the newest version (6.0-21ubuntu1.1).
wget is already the newest version (1.19.4-1ubuntu2.2).
0 upgraded, 0 newly installed, 0 to remove and 45 not upgraded.


--2021-04-04 21:52:12--  https://github.com/tensorflow/tensorflow/archive/v2.4.1.zip
Resolving github.com (github.com)... 140.82.121.4
Connecting to github.com (github.com)|140.82.121.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/tensorflow/tensorflow/zip/v2.4.1 [following]
--2021-04-04 21:52:12--  https://codeload.github.com/tensorflow/tensorflow/zip/v2.4.1
Resolving codeload.github.com (codeload.github.com)... 140.82.121.10
Connecting to codeload.github.com (codeload.github.com)|140.82.121.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘v2.4.1.zip’

     0K .......... .......... .......... .......... ..........  666K
    50K .......... .......... .......... .......... .......... 4.15M
   100K .......... .......... .......... .......... .......... 1.72M
   150K .......... .......... .......... .......... .......... 34.9M
   200K .......... .......... ..

## Import Packages
Import the standard packages as well as the speech processing modules from the TensorFlow GitHub repository downloaded above.

In [37]:
import tensorflow as tf
import sys
# We add this path so we can import the speech processing modules.
sys.path.append("./tensorflow/tensorflow/examples/speech_commands/")
import input_data
import models
import numpy as np
import glob
import os
import re
import shutil

## Import your Custom Dataset
We will build our dataset on top of Pete Warden's Speech Commands dataset, which provides a base set of "other words" and "background noise".

In [3]:
# Path of the dataset directory used by the Python cells
DATASET_DIR = 'dataset/'

In [38]:
%%bash
wget https://storage.googleapis.com/download.tensorflow.org/data/speech_commands_v0.02.tar.gz
# -p keeps the cell re-runnable if the directory already exists
mkdir -p dataset
tar -xf speech_commands_v0.02.tar.gz -C 'dataset'
rm -f speech_commands_v0.02.tar.gz

Process is interrupted.
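
Because the download or extraction can be interrupted, it may help to check what actually landed in the dataset directory before moving on. The following is a minimal sketch, not part of the original notebook: `count_clips` is a hypothetical helper that counts the `.wav` files in each label folder, and the demo runs on a throwaway directory with made-up contents rather than on the real dataset.

```python
import os
import tempfile


def count_clips(dataset_dir):
    """Return {label: number of .wav files} for each subfolder of dataset_dir."""
    counts = {}
    for entry in sorted(os.listdir(dataset_dir)):
        label_dir = os.path.join(dataset_dir, entry)
        if not os.path.isdir(label_dir):
            continue  # skip loose files that sit next to the label folders
        wavs = [f for f in os.listdir(label_dir) if f.lower().endswith('.wav')]
        counts[entry] = len(wavs)
    return counts


# Self-contained demo on a temporary directory (labels and counts are made up).
with tempfile.TemporaryDirectory() as demo_dir:
    for label, n in [('yes', 2), ('_background_noise_', 1)]:
        os.makedirs(os.path.join(demo_dir, label))
        for i in range(n):
            open(os.path.join(demo_dir, label, 'clip%d.wav' % i), 'wb').close()
    demo_counts = count_clips(demo_dir)

print(demo_counts)
```

In the notebook you would call `count_clips(DATASET_DIR)` instead; an empty result or missing labels suggests that the cell above should be run again.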


## Import Recorded Audio Files
Now upload all of the custom audio files (the *.ogg files) that you previously recorded with my [Open Speech Recording Tool Start Script](https://github.com/KlausPuchner/TinyML/blob/main/02_projects/keyword_spotting/customwords/start-open-speech-recording-app.sh). Select all of the files in the upload dialog so that they are uploaded at once!

In [11]:
from ipywidgets import FileUpload
upload = FileUpload(multiple=True)
upload

FileUpload(value={}, description='Upload', multiple=True)

In [21]:
for name, file_info in upload.value.items():
    with open(name, 'wb') as fp:
        fp.write(file_info['content'])
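
The loop above relies on `upload.value` being a dictionary keyed by file name, which is how ipywidgets 7.x exposes uploads. In ipywidgets 8.x the value is a tuple of dictionaries that carry the file name under `'name'` instead. The sketch below handles both layouts; `iter_uploads` and `save_uploads` are hypothetical helper names, and the demo uses stand-in values rather than a real widget.

```python
import os
import tempfile


def iter_uploads(value):
    """Yield (filename, content_bytes) from a FileUpload-style value."""
    if isinstance(value, dict):
        # ipywidgets 7.x layout: {filename: {'content': bytes, ...}}
        for name, info in value.items():
            yield name, bytes(info['content'])
    else:
        # ipywidgets 8.x layout: sequence of {'name': ..., 'content': memoryview, ...}
        for info in value:
            yield info['name'], bytes(info['content'])


def save_uploads(value, target_dir='.'):
    """Write every uploaded file into target_dir and return the saved names."""
    saved = []
    for name, content in iter_uploads(value):
        safe_name = os.path.basename(name)  # drop any directory components
        with open(os.path.join(target_dir, safe_name), 'wb') as fp:
            fp.write(content)
        saved.append(safe_name)
    return saved


# Demo with stand-in values instead of a real widget (file names are made up).
old_style = {'hello_1.ogg': {'content': b'abc'}}
new_style = ({'name': 'hello_2.ogg', 'content': memoryview(b'defg')},)
with tempfile.TemporaryDirectory() as tmp:
    saved_names = save_uploads(old_style, tmp) + save_uploads(new_style, tmp)
    saved_sizes = [os.path.getsize(os.path.join(tmp, n)) for n in saved_names]

print(saved_names, saved_sizes)
```

In the notebook, `save_uploads(upload.value)` would write the files into the current working directory, as the original loop does.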

## Convert .ogg Files to .wav
Now we convert the uploaded files into correctly trimmed WAV files and then store them in the appropriate folders in the ```DATASET_DIR```.
For the trimming we use Pete's extract_loudest_section tool, which is documented here: https://github.com/petewarden/extract_loudest_section
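
To give an intuition for this kind of trimming, here is a minimal pure-Python sketch that picks the one-second window with the highest energy from a list of samples. It is an illustration under simplifying assumptions (mono samples, a fixed window, sum of squares as the loudness measure) and not a port of extract_loudest_section; `loudest_window` is a hypothetical name.

```python
from itertools import accumulate


def loudest_window(samples, sample_rate=16000, seconds=1.0):
    """Return (start_index, clip) for the highest-energy window of the given length."""
    window = int(sample_rate * seconds)
    if len(samples) <= window:
        # Too short to trim: pad with trailing silence up to the window length.
        return 0, list(samples) + [0.0] * (window - len(samples))
    # Prefix sums of squared samples let us read off each window's energy directly.
    prefix = [0.0] + list(accumulate(s * s for s in samples))
    last_start = len(samples) - window
    best_start = max(range(last_start + 1),
                     key=lambda i: prefix[i + window] - prefix[i])
    return best_start, list(samples[best_start:best_start + window])


# Demo: 3 s of silence at 16 kHz with a constant burst between 1.0 s and 2.0 s.
rate = 16000
signal = [0.0] * (3 * rate)
signal[rate:2 * rate] = [0.5] * rate
start, clip = loudest_window(signal, rate)
print(start, len(clip))
```

The sketch returns the earliest window when several windows tie, and it pads short recordings instead of rejecting them; the real tool may behave differently in both cases.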

In [31]:
%%bash
apt-get update -qqq && apt-get install -y --no-install-recommends -qqq git ffmpeg zip
rm -rf wavs
mkdir wavs
find *.ogg -print0 | xargs -0 basename -s .ogg | xargs -I {} ffmpeg -i {}.ogg -ar 16000 wavs/{}.wav
rm -rf *.ogg

# Then use Pete's tool to extract 1-second clips from them for use with the KWS pipeline
mkdir -p trimmed_wavs
git clone https://github.com/petewarden/extract_loudest_section.git
make -C extract_loudest_section/
/tmp/extract_loudest_section/gen/bin/extract_loudest_section 'wavs/*.wav' trimmed_wavs/
# Remove the intermediate files (relative path: wavs, not /wavs)
rm -rf wavs

make: Entering directory '/tf/notebooks/extract_loudest_section'
gcc --std=c++11 -O3 -DNDEBUG -I. -c main.cc -o /tmp/extract_loudest_section//gen/obj/./main.o
gcc --std=c++11 -O3 -DNDEBUG -I. -c status.cc -o /tmp/extract_loudest_section//gen/obj/./status.o
gcc --std=c++11 -O3 -DNDEBUG -I. -c wav_io.cc -o /tmp/extract_loudest_section//gen/obj/./wav_io.o
gcc  -I. \
-o /tmp/extract_loudest_section//gen/bin//extract_loudest_section /tmp/extract_loudest_section//gen/obj/./main.o /tmp/extract_loudest_section//gen/obj/./status.o /tmp/extract_loudest_section//gen/obj/./wav_io.o \
 -lstdc++ -lm
make: Leaving directory '/tf/notebooks/extract_loudest_section'


mkdir: cannot create directory ‘wavs’: File exists
ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-lib

In [32]:
# Index the trimmed clips by word and instance, based on names like "<word>_<instance>.wav"
data_index = {}
os.chdir('trimmed_wavs')
for wav_path in glob.glob('*.wav'):
    matches = re.search(r'([^/_]+)_([^/_]+)\.wav', wav_path)
    if not matches:
        raise Exception('File name not in a recognized form:"%s"' % wav_path)
    word = matches.group(1).lower()
    instance = matches.group(2).lower()
    if word not in data_index:
        data_index[word] = {}
    if instance in data_index[word]:
        raise Exception('Audio instance already seen:"%s"' % wav_path)
    data_index[word][instance] = wav_path

# Store them in the appropriate folders: dataset/<word>/<instance>.wav
output_dir = os.path.join('..', 'dataset')
os.makedirs(output_dir, exist_ok=True)
for word in data_index:
    word_dir = os.path.join(output_dir, word)
    if os.path.isdir(word_dir):
        print('Storing in existing dir: ' + word_dir)
    else:
        os.mkdir(word_dir)
        print('Created dir: ' + word_dir)
    for instance in data_index[word]:
        wav_path = data_index[word][instance]
        output_path = os.path.join(word_dir, instance + '.wav')
        shutil.copyfile(wav_path, output_path)
os.chdir('..')
!rm -r -f trimmed_wavs

Created dir: ../dataset/76d9b80300824a058d6d1b5ce1af36fa
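
The folder name is taken from the part of the file name before the underscore, so it can be worth checking the names before any files are copied. The sketch below is a hypothetical dry run, not part of the original notebook: `parse_clip_name` and `split_by_validity` are made-up helpers, and the file names in the demo are invented.

```python
import re

# Anchored variant of the "<word>_<instance>.wav" pattern (an assumption of this sketch).
CLIP_NAME = re.compile(r'^([^/_]+)_([^/_]+)\.wav$')


def parse_clip_name(file_name):
    """Return (word, instance) in lower case, or None if the name does not match."""
    match = CLIP_NAME.match(file_name)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


def split_by_validity(file_names):
    """Separate names into {name: (word, instance)} and a list of rejected names."""
    good, bad = {}, []
    for name in file_names:
        parsed = parse_clip_name(name)
        if parsed is None:
            bad.append(name)
        else:
            good[name] = parsed
    return good, bad


# Demo with made-up file names.
names = ['Hello_0001.wav', 'hello.wav', 'my_word_3.wav']
good, bad = split_by_validity(names)
print(good)
print(bad)
```

Note that this anchored pattern is stricter than the `re.search` call in the cell above: the unanchored search would accept `my_word_3.wav` and file it under the word `word`, whereas the sketch rejects it so that you can rename the file first.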


## (Optional) Download your Dataset
To zip and download your dataset, run the code cell below.

In [36]:
!zip -r 3customworddataset.zip dataset
from IPython.display import display, FileLink

local_file = FileLink('./3customworddataset.zip', result_html_prefix="Click here to download: ")
display(local_file)


zip error: Nothing to do! (try: zip -r 3customworddataset.zip . -i dataset)
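
The "Nothing to do!" message typically appears when `zip` finds no `dataset` directory in the current working directory. As an alternative, the archive can be built from Python with an explicit existence check. This is a minimal sketch using the standard library's `shutil.make_archive`; `zip_dataset` is a hypothetical helper, and the demo runs on a temporary directory with a dummy file.

```python
import os
import shutil
import tempfile
import zipfile


def zip_dataset(dataset_dir, archive_base):
    """Zip dataset_dir into archive_base + '.zip' and return the archive path."""
    if not os.path.isdir(dataset_dir):
        raise FileNotFoundError('Dataset directory not found: %s' % dataset_dir)
    dataset_abs = os.path.abspath(dataset_dir)
    parent = os.path.dirname(dataset_abs)
    folder = os.path.basename(dataset_abs)
    # root_dir/base_dir keep the top-level folder name inside the archive.
    return shutil.make_archive(archive_base, 'zip', root_dir=parent, base_dir=folder)


# Demo on a temporary directory containing one dummy clip.
with tempfile.TemporaryDirectory() as tmp:
    clip_dir = os.path.join(tmp, 'dataset', 'hello')
    os.makedirs(clip_dir)
    with open(os.path.join(clip_dir, '0.wav'), 'wb') as fp:
        fp.write(b'\x00' * 4)
    archive_path = zip_dataset(os.path.join(tmp, 'dataset'), os.path.join(tmp, 'demo'))
    with zipfile.ZipFile(archive_path) as zf:
        archived_names = zf.namelist()

print(archived_names)
```

In the notebook, `zip_dataset(DATASET_DIR, os.path.abspath('3customworddataset'))` would produce the same file name as the cell above, and the `FileLink` display can stay unchanged.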
