# GRU RNN Model

This notebook trains a GRU RNN model for multi-class classification across all of the selected classes.

### Importing Required Libraries

This section imports the libraries used to train the GRU multi-class classification model.

In [1]:
import numpy as np
import pandas as pd
import os
import glob
from keras.models import Sequential
from keras.layers import GRU, Dense, Input
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.callbacks import ModelCheckpoint, TensorBoard, EarlyStopping
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
import time

The parent directory is added to `sys.path` so that the Python scripts for loading the training data can be imported.

In [2]:
import sys
nb_dir = os.path.split(os.getcwd())[0]
if nb_dir not in sys.path:
    sys.path.append(nb_dir)
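
The same path setup can be wrapped in a small helper so that it stays idempotent when the cell is re-run. This is a minimal sketch; the function name `add_parent_dir_to_path` is hypothetical and not part of the project's scripts.

```python
import os
import sys


def add_parent_dir_to_path(start_dir=None):
    """Append the parent of start_dir (default: the current working directory) to sys.path once."""
    start_dir = os.getcwd() if start_dir is None else start_dir
    parent_dir = os.path.split(start_dir)[0]
    # Only append when missing, so repeated calls do not grow sys.path
    if parent_dir not in sys.path:
        sys.path.append(parent_dir)
    return parent_dir


nb_dir = add_parent_dir_to_path()
```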

Importing the training data loader function from the Python scripts. This function loads the data directly from the *.wav* files.

In [3]:
from py_scripts.directory_funcs import *
from py_scripts.wav_file_funcs import *
from py_scripts.misc_audio_signal_funcs import *
from py_scripts.raw_training_data_creation import load_irmas_data

### Loading Training Data

This section uses `load_irmas_data` from the training data creation module to load the training data.

In [4]:
# First define the training data directory, and find the different classes that exist
training_data_dir = '../../data/whole_dataset/training/'
get_subdirectory_names(training_data_dir)

['cel', 'cla', 'flu', 'gac', 'gel', 'org', 'pia', 'sax', 'tru', 'vio', 'voi']

In [5]:
# Define the classes that we will work with
all_classes = get_subdirectory_names(training_data_dir)
# Defining the classes on which we will train for the project
classes_for_project = ['gel', 'pia', 'sax', 'vio', 'voi']
mapping_to_index = dict(zip(classes_for_project, range(len(classes_for_project))))
# Counting the number of .wav files available for each class
class_num_files = {
    class_name: len(get_file_names(construct_path(training_data_dir, class_name), '*.wav'))
    for class_name in classes_for_project
}
# Defining the class which will be used as the one v/s all classifier to denote all other classes except the current class

In [6]:
# Defining the various parameters for loading the input data
rnn_window = (300, 300) # (window length, shift): length of each feature vector and the step between windows
class_num_examples = 200
X, y = load_irmas_data(training_data_dir, classes_for_project, rnn_window, number_of_training_examples_per_class=class_num_examples)

Getting Data from gel
Processing: 200 files
Loaded all the data from the class
Getting Data from pia
Processing: 200 files
Loaded all the data from the class
Getting Data from sax
Processing: 200 files
Loaded all the data from the class
Getting Data from vio
Processing: 200 files
Loaded all the data from the class
Getting Data from voi
Processing: 200 files
Loaded all the data from the class


In [7]:
# Verifying the shape
X.shape, y.shape

((4000, 294, 300), (4000,))
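
To illustrate how a window of `(300, 300)` can turn a 1-D signal into a `(time steps, features)` matrix, here is a minimal sketch. The function `frame_signal` is hypothetical and is not the implementation inside `load_irmas_data`; the signal length of 88,200 samples is chosen only so that the arithmetic reproduces the 294 time steps seen in the shape above.

```python
import numpy as np


def frame_signal(signal, window_length, shift):
    """Cut a 1-D signal into windows of window_length, moving forward by shift samples each time."""
    signal = np.asarray(signal)
    if len(signal) < window_length:
        return np.empty((0, window_length), dtype=signal.dtype)
    num_frames = 1 + (len(signal) - window_length) // shift
    starts = np.arange(num_frames) * shift
    # Each row is one window; trailing samples that do not fill a window are dropped
    return np.stack([signal[start:start + window_length] for start in starts])


# Toy signal standing in for audio samples (assumed length, for illustration only)
toy_signal = np.arange(88200, dtype=np.float32)
frames = frame_signal(toy_signal, window_length=300, shift=300)
print(frames.shape)
```

With the shift equal to the window length, the windows do not overlap, so each sample appears in at most one time step.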

In [8]:
# Doing a train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
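
The split above is neither seeded nor stratified, so class proportions and results can vary between runs. A possible variant, shown on toy data rather than the real `X` and `y`, uses the `stratify` and `random_state` parameters of `train_test_split`; the seed value 42 is an arbitrary choice.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for X and y: 20 samples per class, tiny feature matrices
labels = np.repeat(['gel', 'pia', 'sax', 'vio', 'voi'], 20)
features = np.zeros((len(labels), 4, 3), dtype=np.float32)

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels,
    test_size=0.2,
    stratify=labels,    # keep the class proportions equal in both splits
    random_state=42,    # fixed seed so the split is reproducible
)

# Count how many test samples each class received
test_counts = dict(zip(*np.unique(y_te, return_counts=True)))
print(test_counts)
```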

In [9]:
# Freeing up previous memory
X = None
y = None

In [10]:
print(f'Total Number of Training Samples: {y_train.shape}')
print(f'Total number of timestamp values for each sample: {X_train.shape[1]}')
print(f'Total number of features for each sample: {X_train.shape[-1]}')
#print(f'Minimum Feature Value: {np.min(X_train)}, Maximum Feature Value: {np.max(X_train)}')

Total Number of Training Samples: (3200,)
Total number of timestamp values for each sample: 294
Total number of features for each sample: 300


In [11]:
print(f'Total Number of Testing Samples: {y_test.shape}')
print(f'Total number of timestamp values for each sample: {X_test.shape[1]}')
print(f'Total number of features for each sample: {X_test.shape[-1]}')
#print(f'Minimum Feature Value: {np.min(X_test)}, Maximum Feature Value: {np.max(X_test)}')

Total Number of Testing Samples: (800,)
Total number of timestamp values for each sample: 294
Total number of features for each sample: 300


In [12]:
y_train_categorical = pd.Categorical(y_train)
y_train_numerical = y_train_categorical.codes
y_test_categorical = pd.Categorical(y_test)
y_test_numerical = y_test_categorical.codes
# Checking the categories
print(y_train_categorical.categories, y_test_categorical.categories)

Index(['gel', 'pia', 'sax', 'vio', 'voi'], dtype='object') Index(['gel', 'pia', 'sax', 'vio', 'voi'], dtype='object')


In [13]:
# Find the number of categories
len(y_train_categorical.categories)

5

In [14]:
# If training for more than 2 classes, the integer codes need to be one-hot encoded
if len(y_train_categorical.categories) > 2:
    y_train_numerical = to_categorical(y_train_numerical)
    y_test_numerical = to_categorical(y_test_numerical)

In [15]:
y_train_numerical.shape

(3200, 5)
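
The encoding steps above derive the integer codes separately for the training and test labels, which only agree when both splits contain every class. As a sketch of an alternative, the `mapping_to_index` dictionary defined earlier can fix the class order explicitly; the helper `one_hot_labels` is hypothetical and uses plain NumPy instead of `to_categorical`.

```python
import numpy as np

classes_for_project = ['gel', 'pia', 'sax', 'vio', 'voi']
mapping_to_index = dict(zip(classes_for_project, range(len(classes_for_project))))


def one_hot_labels(labels, mapping):
    """Map string labels to one-hot rows using a fixed label-to-index mapping."""
    codes = np.array([mapping[label] for label in labels])
    # Row i of the identity matrix is the one-hot vector for class index i
    return np.eye(len(mapping), dtype=np.float32)[codes]


example = one_hot_labels(['pia', 'voi', 'gel'], mapping_to_index)
print(example.shape)
```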

## Check For Memory Usage

In [16]:
import sys

# These are the usual ipython objects, including this one you are creating
ipython_vars = ['In', 'Out', 'exit', 'quit', 'get_ipython', 'ipython_vars']

# Get a sorted list of the objects and their sizes
sorted([(x, sys.getsizeof(globals().get(x))) for x in dir() if not x.startswith('_') and x not in sys.modules and x not in ipython_vars], key=lambda x: x[1], reverse=True)

[('X_train', 1128960128),
 ('X_test', 282240128),
 ('y_train', 38496),
 ('y_test', 9696),
 ('y_train_categorical', 3684),
 ('y_test_categorical', 1284),
 ('Dense', 1056),
 ('EarlyStopping', 1056),
 ('Embedding', 1056),
 ('GRU', 1056),
 ('ModelCheckpoint', 1056),
 ('Sequential', 1056),
 ('TensorBoard', 1056),
 ('nb_dir', 285),
 ('class_num_files', 240),
 ('mapping_to_index', 240),
 ('all_classes', 160),
 ('training_data_dir', 153),
 ('Input', 136),
 ('check_output_directory', 136),
 ('construct_path', 136),
 ('exist_directory', 136),
 ('exist_file', 136),
 ('get_directory_contents', 136),
 ('get_file_names', 136),
 ('get_left_channel_data', 136),
 ('get_right_channel_data', 136),
 ('get_sound_signals', 136),
 ('get_subdirectory_names', 136),
 ('load_irmas_data', 136),
 ('normalize_sound_signals', 136),
 ('read_wav_file', 136),
 ('shift_sound_signals', 136),
 ('to_categorical', 136),
 ('train_test_split', 136),
 ('y_test_numerical', 112),
 ('y_train_numerical', 112),
 ('classes_for_proje
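
The sizes reported for `X_train` and `X_test` can be checked by hand. The sketch below assumes 4-byte elements (for example `float32`), which is an inference from the reported numbers rather than something the notebook states; the helper name is hypothetical.

```python
import math


def expected_array_bytes(shape, itemsize):
    """Raw data size of a dense array: number of elements times bytes per element."""
    return math.prod(shape) * itemsize


x_train_bytes = expected_array_bytes((3200, 294, 300), itemsize=4)
x_test_bytes = expected_array_bytes((800, 294, 300), itemsize=4)
print(x_train_bytes, x_test_bytes)
```

Under that assumption the data sizes come to 1,128,960,000 and 282,240,000 bytes, each 128 bytes below the figures listed above, which is consistent with a fixed per-array object overhead. The very small sizes listed for `y_train_numerical` and `y_test_numerical` are probably because `sys.getsizeof` does not count the data buffer of an array that does not own its data; the array's `nbytes` attribute is usually the more reliable figure.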

## Model Definition

This section defines the model architecture used for training.

In [17]:
# Defining the input dimensions for the recurrent layer
number_of_features = X_train.shape[-1]
number_of_time_stamps = X_train.shape[1]
print(f'Number of Features (Feature Vector Length): {number_of_features}, Number of Time Stamps: {number_of_time_stamps}')

Number of Features (Feature Vector Length): 300, Number of Time Stamps: 294


In [18]:
# Define the model
rnn_layer_num_units = 50
num_classes_for_training = len(y_train_categorical.categories)
model = Sequential()
model.add(GRU(rnn_layer_num_units, input_shape=(number_of_time_stamps, number_of_features), dropout = 0.1))
model.add(Dense(num_classes_for_training, activation='softmax'))
# Compiling the model
model.compile(loss= 'categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
gru_1 (GRU)                  (None, 50)                52650     
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 255       
Total params: 52,905
Trainable params: 52,905
Non-trainable params: 0
_________________________________________________________________
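
The parameter counts in the summary can be reproduced with simple arithmetic. This sketch assumes the classic GRU formulation with one bias vector per gate; newer Keras versions default to `reset_after=True`, which adds a second bias vector per gate and gives a larger count. The function names are hypothetical.

```python
def gru_param_count(units, input_dim):
    """Three gates, each with input weights, recurrent weights, and one bias vector."""
    return 3 * (units * (input_dim + units) + units)


def dense_param_count(input_dim, output_dim):
    """Weight matrix plus one bias per output unit."""
    return input_dim * output_dim + output_dim


gru_params = gru_param_count(units=50, input_dim=300)
dense_params = dense_param_count(input_dim=50, output_dim=5)
print(gru_params, dense_params, gru_params + dense_params)
```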


In [21]:
# Setting up the checkpoint, TensorBoard, and early stopping callbacks
parent_weight_save_dir = '../../data/Training Results/GRU/Weights'
tensor_board_dir_path = '../../data/Training Results/GRU/TensorBoard'
check_output_directory(parent_weight_save_dir)
current_experiment_name = f'AllClasses_InputVectorLen-{number_of_features}_TimeStamps-{number_of_time_stamps}_CT-{time.time()}'
weight_file_path = os.path.join(parent_weight_save_dir, f'{current_experiment_name}.hdf5')
tensor_board_file_path = os.path.join(tensor_board_dir_path, current_experiment_name)
check_output_directory(tensor_board_file_path)
checkpoint = ModelCheckpoint(weight_file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
tensorboard = TensorBoard(log_dir=tensor_board_file_path)
early_stopping_criteria = EarlyStopping(monitor='val_acc', min_delta=0, patience=5, verbose=0, mode='auto')
callbacks_list = [tensorboard, checkpoint, early_stopping_criteria]
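
To make the effect of `patience=5` and `min_delta=0` concrete, here is a simplified model of the patience rule for a metric that should increase. It is an illustration only, not Keras's implementation, and the validation accuracies are made-up toy values.

```python
def early_stopping_epoch(values, patience=5, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float('-inf')
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if value - min_delta > best:
            # Strict improvement: remember it and reset the counter
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Toy validation accuracies (not taken from any real run)
toy_val_acc = [0.36, 0.37, 0.37, 0.36, 0.35, 0.37, 0.36]
stop_epoch = early_stopping_epoch(toy_val_acc, patience=5)
print(stop_epoch)
```

In this toy sequence the best value is reached at epoch 2 and never exceeded, so the rule stops after five further epochs. Note also that newer Keras releases log the metric as `val_accuracy` rather than `val_acc`, so the `monitor` argument may need adjusting depending on the installed version.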

In [22]:
history = model.fit(X_train, y_train_numerical, epochs=50, batch_size=64, validation_split=0.2, callbacks=callbacks_list)

Train on 2560 samples, validate on 640 samples
Epoch 1/50

Epoch 00001: val_loss improved from -inf to 1.55070, saving model to ../../data/Training Results/GRU/Weights\AllClasses_InputVectorLen-300_TimeStamps-294_CT-1544926226.872478.hdf5
Epoch 2/50

Epoch 00002: val_loss did not improve from 1.55070
Epoch 3/50

Epoch 00003: val_loss did not improve from 1.55070
Epoch 4/50

Epoch 00004: val_loss did not improve from 1.55070
Epoch 5/50

Epoch 00005: val_loss did not improve from 1.55070
Epoch 6/50

KeyboardInterrupt: 

In [23]:
# Final evaluation of the model
scores = model.evaluate(X_test, y_test_numerical, verbose=1)
print("Accuracy: %.2f%%" % (scores[1]*100))

Accuracy: 47.10%
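
A single accuracy figure hides which classes are being confused with each other. The following sketch builds a confusion matrix from one-hot labels and predicted probabilities, such as those returned by `model.predict(X_test)`. The helper name is hypothetical and the arrays are toy values, not outputs of this model.

```python
import numpy as np


def confusion_matrix_from_one_hot(y_true_one_hot, y_pred_probs):
    """Rows are true classes, columns are predicted classes."""
    true_idx = np.argmax(y_true_one_hot, axis=1)
    pred_idx = np.argmax(y_pred_probs, axis=1)
    num_classes = y_true_one_hot.shape[1]
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # Accumulate one count per (true, predicted) pair
    np.add.at(matrix, (true_idx, pred_idx), 1)
    return matrix


# Toy example with 3 classes and 4 samples
toy_true = np.eye(3)[[0, 1, 2, 2]]
toy_probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.6, 0.3],
                      [0.2, 0.5, 0.3],
                      [0.1, 0.2, 0.7]])
toy_matrix = confusion_matrix_from_one_hot(toy_true, toy_probs)
toy_accuracy = np.trace(toy_matrix) / toy_matrix.sum()
print(toy_matrix)
print(toy_accuracy)
```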


### Second Model Definition (100 Recurrent Units)

In [24]:
# Define the model
rnn_layer_num_units = 100
num_classes_for_training = len(y_train_categorical.categories)
model = Sequential()
model.add(GRU(rnn_layer_num_units, input_shape=(number_of_time_stamps, number_of_features), dropout = 0.1))
model.add(Dense(num_classes_for_training, activation='softmax'))
# Compiling the model
model.compile(loss= 'categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_2 (LSTM)                (None, 100)               160400    
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 505       
Total params: 160,905
Trainable params: 160,905
Non-trainable params: 0
_________________________________________________________________
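
The layer name and parameter count in this summary can be compared with the standard formulas for both layer types. The sketch below is only an arithmetic check, assuming one bias vector per gate; the function names are hypothetical.

```python
def recurrent_param_count(units, input_dim, num_gates):
    """Each gate has input weights, recurrent weights, and one bias vector."""
    return num_gates * (units * (input_dim + units) + units)


lstm_params = recurrent_param_count(units=100, input_dim=300, num_gates=4)
gru_params = recurrent_param_count(units=100, input_dim=300, num_gates=3)
print(lstm_params, gru_params)
```

The reported 160,400 parameters match the four-gate (LSTM) count, whereas a GRU with 100 units under the same assumption would have 120,300. This suggests the printed summary may come from an earlier LSTM run of the cell rather than from the GRU code shown.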


In [25]:
# Setting up the checkpoint, TensorBoard, and early stopping callbacks
parent_weight_save_dir = '../../data/Training Results/GRU/Weights'
tensor_board_dir_path = '../../data/Training Results/GRU/TensorBoard'
check_output_directory(parent_weight_save_dir)
current_experiment_name = f'AllClasses_InputVectorLen-{number_of_features}_TimeStamps-{number_of_time_stamps}_CT-{time.time()}'
weight_file_path = os.path.join(parent_weight_save_dir, f'{current_experiment_name}.hdf5')
tensor_board_file_path = os.path.join(tensor_board_dir_path, current_experiment_name)
check_output_directory(tensor_board_file_path)
checkpoint = ModelCheckpoint(weight_file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
tensorboard = TensorBoard(log_dir=tensor_board_file_path)
early_stopping_criteria = EarlyStopping(monitor='val_acc', min_delta=0, patience=5, verbose=0, mode='auto')
callbacks_list = [tensorboard, checkpoint, early_stopping_criteria]
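
Since the same path-building code appears for each model, it could be factored into a helper. This is a minimal sketch using only the standard library; `build_experiment_paths` is a hypothetical name, and the `Weights` and `TensorBoard` subdirectory names simply follow the paths used above.

```python
import os
import time


def build_experiment_paths(results_root, number_of_features, number_of_time_stamps, timestamp=None):
    """Return the experiment name, weight file path, and TensorBoard log directory."""
    timestamp = time.time() if timestamp is None else timestamp
    experiment_name = (f'AllClasses_InputVectorLen-{number_of_features}'
                       f'_TimeStamps-{number_of_time_stamps}_CT-{timestamp}')
    weight_file_path = os.path.join(results_root, 'Weights', f'{experiment_name}.hdf5')
    tensor_board_path = os.path.join(results_root, 'TensorBoard', experiment_name)
    return experiment_name, weight_file_path, tensor_board_path


name, weights, logs = build_experiment_paths('../../data/Training Results/GRU', 300, 294)
print(name)
```

The returned paths can then be passed to `check_output_directory`, `ModelCheckpoint`, and `TensorBoard` exactly as in the cell above.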

In [26]:
history = model.fit(X_train, y_train_numerical, epochs=50, batch_size=64, validation_split=0.2, callbacks=callbacks_list)

Train on 3200 samples, validate on 800 samples
Epoch 1/50

Epoch 00001: val_acc improved from -inf to 0.36375, saving model to ../../data/Training Results/LSTM/Weights\AllClasses_InputVectorLen-300_TimeStamps-294_CT-1544163216.7849946.hdf5
Epoch 2/50

Epoch 00002: val_acc improved from 0.36375 to 0.37375, saving model to ../../data/Training Results/LSTM/Weights\AllClasses_InputVectorLen-300_TimeStamps-294_CT-1544163216.7849946.hdf5
Epoch 3/50

Epoch 00003: val_acc improved from 0.37375 to 0.37875, saving model to ../../data/Training Results/LSTM/Weights\AllClasses_InputVectorLen-300_TimeStamps-294_CT-1544163216.7849946.hdf5
Epoch 4/50

Epoch 00004: val_acc improved from 0.37875 to 0.38000, saving model to ../../data/Training Results/LSTM/Weights\AllClasses_InputVectorLen-300_TimeStamps-294_CT-1544163216.7849946.hdf5
Epoch 5/50

Epoch 00005: val_acc improved from 0.38000 to 0.39250, saving model to ../../data/Training Results/LSTM/Weights\AllClasses_InputVectorLen-300_TimeStamps-294_CT-

In [27]:
# Final evaluation of the model
scores = model.evaluate(X_test, y_test_numerical, verbose=1)
print("Accuracy: %.2f%%" % (scores[1]*100))

Accuracy: 49.10%
