## Q1

Build a Multi-classifier Machine Learning Model to Switch On/Off Devices
using Voice Commands on Raspberry Pi 4

**Load dataset**

In [2]:
!pip install -U -q tensorflow tensorflow_datasets

In [4]:
import os
import pathlib
import matplotlib.pyplot as plt
import numpy as np
from numpy import random
import shutil, errno
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import models

In [5]:
DATASET_PATH = 'data/'
data_dir = pathlib.Path(DATASET_PATH)
# download and extract the Speech Commands v0.02 archive only if data/ is missing
if not data_dir.exists():
    tf.keras.utils.get_file(
        'speech_commands.zip',
        origin='http://download.tensorflow.org/data/speech_commands_v0.02.tar.gz',
        extract=True,
        cache_dir='.', cache_subdir='data')

Downloading data from http://download.tensorflow.org/data/speech_commands_v0.02.tar.gz


In [6]:
# explore labels (the listing also contains non-label entries such as LICENSE)
commands = np.array(tf.io.gfile.listdir(str(data_dir)))
commands = commands[commands != 'README.md']
print('Commands:', commands)

Commands: ['right' 'eight' 'cat' 'tree' 'backward' 'learn' 'bed' 'happy' 'go'
 '.DS_Store' 'validation_list.txt' 'LICENSE' 'dog' 'no' 'wow' 'follow'
 'nine' 'left' 'stop' 'three' '_background_noise_' 'sheila' 'one' 'bird'
 'zero' 'seven' 'up' 'speech_commands.zip' 'visual' 'marvin' 'two' 'house'
 'down' 'six' 'yes' 'on' 'testing_list.txt' 'five' 'forward' 'off' 'four']


**Split directories for string "on", "off", "others", "silent"**

In [7]:
# create the target folder; exist_ok avoids an error when the cell is re-run
os.makedirs('data/mydataset', exist_ok=True)

In [None]:
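def copy_data(src, dst):
    """Copy a directory tree; fall back to a plain file copy if src is a file."""
    try:
        shutil.copytree(src, dst)
    except OSError as exc:
        # copytree fails with ENOTDIR (EINVAL on some platforms) when src is a file
        if exc.errno in (errno.ENOTDIR, errno.EINVAL):
            shutil.copy(src, dst)
        else:
            raise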

# directories for on and off
copy_data('data/on','data/mydataset/on')
copy_data('data/off','data/mydataset/off')

In [12]:
# directories for others
os.makedirs('data/mydataset/others', exist_ok=True)
other_labels = [
    "yes", "no", "up", "down", "left", "right", "bed", "bird",
    "cat", "dog", "happy", "house", "marvin", "sheila", "tree", "wow",
]
sample_per_label = 150

for label in other_labels:
    path = os.path.join('data', label)
    # keep only regular files so that every sampled entry can be copied
    files = [f for f in os.listdir(path) if os.path.isfile(os.path.join(path, f))]
    random.shuffle(files)
    # take at most sample_per_label randomly chosen files from this label
    for file_name in files[:sample_per_label]:
        source = os.path.join(path, file_name)
        # prefix with the label: the same file name can occur under several labels
        destination = os.path.join('data/mydataset/others', label + '_' + file_name)
        shutil.copy(source, destination)

In [13]:
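# directory for silent
os.makedirs('data/mydataset/silent', exist_ok=True)
from scipy.io import wavfile

fs = 16000        # sample rate in Hz; each clip holds fs samples (1 second)
num_files = 2400

for i in range(num_files):
    filename = str(i*100) + 'silent.wav'
    # low-amplitude Gaussian noise whose standard deviation (0.01*i) grows with i
    sample = 0.01 * i * random.randn(fs)
    # the int16 cast truncates toward zero, so the earliest clips are almost all zeros
    wavfile.write('data/mydataset/silent/' + filename, fs, sample.astype(np.int16))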
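
Before uploading the dataset, it can help to confirm how many clips ended up in each label folder. The sketch below is a minimal check, assuming the folder layout created above (`data/mydataset/<label>/*.wav`). The helper name `count_clips_per_label` is hypothetical and not part of the original notebook.

```python
import pathlib


def count_clips_per_label(root):
    """Return {label: number of .wav files} for each subfolder of root."""
    root = pathlib.Path(root)
    counts = {}
    for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[label_dir.name] = sum(
            1 for f in label_dir.iterdir()
            if f.is_file() and f.suffix == '.wav'
        )
    return counts


# Assumed location of the dataset built in the cells above.
dataset_root = pathlib.Path('data/mydataset')
if dataset_root.exists():
    for label, n in count_clips_per_label(dataset_root).items():
        print(f'{label}: {n}')
```

With the settings used here, `others` should hold at most 16 × 150 = 2400 clips and `silent` should hold 2400, which makes any large imbalance against `on` and `off` easy to spot.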

### Submit the dataset you used. Organize your data to have the data for each label in a separate folder (as shown in the lecture demo).

Refer to mydataset

### Submit a copy of your pipeline. Customize the parameters as needed.

Refer to q1_1.png

### Submit a screenshot of your Feature Generation Output. Your screenshot should show 1) your name, 2) job completion in green and 3) Feature explorer as shown in the screenshot below.

Refer to q1_2.png

### Submit a screenshot of the output of your classifier. Make sure your screenshot shows the target device, the results of the classification and your name. If you used enough data and reasonable model parameters, it is expected that your model’s accuracy will be greater than 90%.

Refer to q1_3.png, q1_4.png

### Record a 1-minute video showing the live-classification on your smartphone or your laptop for switching on and off devices.

Refer to [live-classification](https://www.dropbox.com/scl/fi/zpyxsxbbfs489b9foi2gy/live-classification.mov?rlkey=1o9akxg788g8oqgqrhj7quiia&st=pa5r6zps&dl=0)

## Q2

Are you using the most optimal Neural Network architecture for Raspberry
Pi 4? Use EON Tuner to validate your answer. Submit a screenshot of the EON Tuner
and comment on your findings. Use 100ms target inference time for your calculations.

Refer to q2.png

The best model found by the EON Tuner reaches 91% accuracy with an inference latency of 1 ms, well within the 100 ms target. Its architecture has four convolutional layers with 32, 64, 128, and 256 filters respectively, plus one dropout layer with a rate of 0.5. However, the model I trained reaches 93.4% accuracy. It consists of three 1D convolutional layers with 8, 16, and 64 filters, each followed by dropout. The EON Tuner therefore did not find an architecture that is more accurate than the one already in use.
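
For reference, the trained architecture described above could be sketched in Keras roughly as follows. Only the three 1D convolutional layers with 8, 16, and 64 filters, the dropout after each of them, and the four output classes (on, off, others, silent) come from this report. The input shape (50 frames × 13 MFCC coefficients), kernel size, pooling, dropout rate, and the function name `build_conv1d_sketch` are assumptions for illustration and may differ from the actual Edge Impulse model.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_conv1d_sketch(input_shape=(50, 13), num_classes=4, dropout_rate=0.25):
    """Three Conv1D blocks (8, 16, 64 filters), each followed by dropout."""
    stack = [layers.Input(shape=input_shape)]  # assumed MFCC frames x coefficients
    for filters in (8, 16, 64):
        # kernel size, padding, and pooling are assumptions, not taken from the report
        stack.append(layers.Conv1D(filters, kernel_size=3, padding='same',
                                   activation='relu'))
        stack.append(layers.MaxPooling1D(pool_size=2, padding='same'))
        stack.append(layers.Dropout(dropout_rate))
    stack.append(layers.Flatten())
    stack.append(layers.Dense(num_classes, activation='softmax'))
    return models.Sequential(stack)


model = build_conv1d_sketch()
model.summary()
```

Printing the summary gives the parameter count, which is a quick way to compare this small network against the larger four-layer candidate proposed by the EON Tuner.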