## SRNN on Speech Commands Dataset


Run `fetch_google.sh` to download the Google Speech Commands Dataset, then run `python process_google.py` to extract the features used below.

In [1]:
from __future__ import print_function
import sys
import os
import numpy as np
import torch

In [2]:
from edgeml_pytorch.graph.rnn import SRNN2
from edgeml_pytorch.trainer.srnnTrainer import SRNNTrainer
import edgeml_pytorch.utils as utils

In [3]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
DATA_DIR = './GoogleSpeech/Extracted/'

In [4]:
x_train_, y_train = np.load(DATA_DIR + 'x_train.npy'), np.load(DATA_DIR + 'y_train.npy')
x_val_, y_val = np.load(DATA_DIR + 'x_val.npy'), np.load(DATA_DIR + 'y_val.npy')
x_test_, y_test = np.load(DATA_DIR + 'x_test.npy'), np.load(DATA_DIR + 'y_test.npy')
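# Mean-variance normalize each feature using training-set statistics only
mean = np.mean(np.reshape(x_train_, [-1, x_train_.shape[-1]]), axis=0)
std = np.std(np.reshape(x_train_, [-1, x_train_.shape[-1]]), axis=0)
std[std < 1e-6] = 1  # avoid dividing by (near-)zero for constant features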
x_train_ = (x_train_ - mean) / std
x_val_ = (x_val_ - mean) / std
x_test_ = (x_test_ - mean) / std

# Swap to time-major layout: [batch, time, feature] -> [time, batch, feature]
x_train = np.swapaxes(x_train_, 0, 1)
x_val = np.swapaxes(x_val_, 0, 1)
x_test = np.swapaxes(x_test_, 0, 1)
print("Train shape", x_train.shape, y_train.shape)
print("Val shape", x_val.shape, y_val.shape)
print("Test shape", x_test.shape, y_test.shape)

Train shape (99, 51088, 32) (51088, 13)
Val shape (99, 6798, 32) (6798, 13)
Test shape (99, 6835, 32) (6835, 13)
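
The preprocessing in cell 4 can be checked on synthetic data without downloading the dataset. The sketch below is a minimal illustration, assuming inputs shaped `[batch, time, feature]`; the helper name `normalize_and_swap` and the toy arrays are hypothetical and not part of `edgeml_pytorch`.

```python
import numpy as np

def normalize_and_swap(x_train, x_other, eps=1e-6):
    """Normalize both arrays with training-set statistics, return time-major copies."""
    num_features = x_train.shape[-1]
    flat = x_train.reshape(-1, num_features)   # pool over batch and time
    mean = flat.mean(axis=0)
    std = flat.std(axis=0)
    std[std < eps] = 1.0                       # constant features: skip scaling
    x_train_n = (x_train - mean) / std
    x_other_n = (x_other - mean) / std         # reuse training statistics
    # [batch, time, feature] -> [time, batch, feature]
    return np.swapaxes(x_train_n, 0, 1), np.swapaxes(x_other_n, 0, 1)

# Toy stand-ins for x_train_ and x_val_ (hypothetical data, not the real features)
rng = np.random.default_rng(0)
toy_train = rng.normal(loc=3.0, scale=2.0, size=(8, 99, 32))
toy_train[..., 0] = 5.0                        # a constant feature column
toy_val = rng.normal(loc=3.0, scale=2.0, size=(4, 99, 32))

train_tm, val_tm = normalize_and_swap(toy_train, toy_val)
print(train_tm.shape, val_tm.shape)
```

Reusing the training mean and standard deviation for the validation and test splits keeps all three on the same scale and avoids leaking held-out statistics into preprocessing.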


In [5]:
numTimeSteps = x_train.shape[0]
numInput = x_train.shape[-1]
numClasses = y_train.shape[1]

# Network Parameters
brickSize = 11
hiddenDim0 = 64
hiddenDim1 = 32
cellType = 'LSTM'
learningRate = 0.01
batchSize = 128
epochs = 10
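
With the shapes printed above, `numTimeSteps` is 99, so `brickSize = 11` divides each sequence evenly into 9 bricks. The sketch below shows one way such a split could look, assuming contiguous, non-overlapping bricks over the time axis; `split_into_bricks` is a hypothetical helper for illustration, and the actual bricking inside `SRNN2` may differ.

```python
import numpy as np

def split_into_bricks(x, brick_size):
    """Hypothetical helper: [T, B, D] -> [T // brick_size, brick_size, B, D]."""
    num_steps, batch, dim = x.shape
    if num_steps % brick_size != 0:
        raise ValueError("number of time steps must be divisible by brick_size")
    # Consecutive time steps stay together inside each brick
    return x.reshape(num_steps // brick_size, brick_size, batch, dim)

# Toy time-major input with 99 steps, 2 examples, 32 features
toy = np.arange(99 * 2 * 32, dtype=np.float32).reshape(99, 2, 32)
bricks = split_into_bricks(toy, 11)
print(bricks.shape)  # (9, 11, 2, 32)
```

Under this assumption, a `brickSize` that does not divide 99 (for example 10) would need padding or truncation of the sequences.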

In [6]:
srnn2 = SRNN2(numInput, numClasses, hiddenDim0, hiddenDim1, cellType).to(device) 
trainer = SRNNTrainer(srnn2, learningRate, lossType='xentropy', device=device)

Using x-entropy loss


In [7]:
trainer.train(brickSize, batchSize, epochs, x_train, x_val, y_train, y_val, printStep=200, valStep=5)

Epoch 0 batch 0 loss 4.295151 acc 0.031250
Epoch 0 batch 200 loss 1.002617 acc 0.718750
Epoch 1 batch 0 loss 0.647069 acc 0.796875
Epoch 1 batch 200 loss 0.469229 acc 0.835938
Epoch 2 batch 0 loss 0.388671 acc 0.882812
Epoch 2 batch 200 loss 0.396696 acc 0.859375
Epoch 3 batch 0 loss 0.266433 acc 0.921875
Epoch 3 batch 200 loss 0.281694 acc 0.867188
Epoch 4 batch 0 loss 0.302240 acc 0.906250
Epoch 4 batch 200 loss 0.245797 acc 0.929688
Validation accuracy: 0.911003
Epoch 5 batch 0 loss 0.202542 acc 0.945312
Epoch 5 batch 200 loss 0.192004 acc 0.929688
Epoch 6 batch 0 loss 0.256735 acc 0.921875
Epoch 6 batch 200 loss 0.279066 acc 0.921875
Epoch 7 batch 0 loss 0.228837 acc 0.945312
Epoch 7 batch 200 loss 0.222357 acc 0.937500
Epoch 8 batch 0 loss 0.164639 acc 0.960938
Epoch 8 batch 200 loss 0.160117 acc 0.945312
Epoch 9 batch 0 loss 0.173849 acc 0.953125
Epoch 9 batch 200 loss 0.201694 acc 0.929688
Validation accuracy: 0.912474
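
The `acc` values in the log are consistent with per-batch accuracy at `batchSize = 128`: the first value, 0.031250, equals 4 correct out of 128. Because the labels are one-hot (13 columns), accuracy can be computed by comparing arg-max indices. The following is a minimal sketch of that calculation, not the trainer's own implementation; `onehot_accuracy` and the toy arrays are hypothetical.

```python
import numpy as np

def onehot_accuracy(logits, y_onehot):
    """Fraction of rows whose arg-max prediction matches the one-hot label."""
    predictions = np.argmax(logits, axis=1)
    targets = np.argmax(y_onehot, axis=1)
    return float(np.mean(predictions == targets))

# Toy batch: 4 examples, 3 classes (hypothetical values)
toy_logits = np.array([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.9, 0.8, 0.1],   # predicted class 0, label is class 1
                       [0.1, 0.2, 3.0]])
toy_labels = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])

toy_acc = onehot_accuracy(toy_logits, toy_labels)
print(toy_acc)  # 0.75
```

The same function could be applied to model outputs on the held-out test split (`x_test`, `y_test`), which the cells above load but do not evaluate.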
