# Vanilla Binary RNN Model

This notebook trains a simple vanilla RNN model for binary classification of Piano and Electric Guitar. We chose these two classes because their sample counts are nearly equal, so class imbalance is avoided up front and does not need separate handling. Note that the cells as run below use a one-vs-all setup instead: piano (`pia`) is the primary class, and a similarly sized `bad` class is pooled from the other project classes.

### Importing Required Libraries

This section imports the libraries used to train the vanilla RNN binary classification model.

In [2]:
import numpy as np
import pandas as pd
import os
import glob
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense, Input
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.callbacks import ModelCheckpoint, TensorBoard, EarlyStopping
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
import time

Using TensorFlow backend.


The next cell adds the notebook's parent directory to `sys.path` so that the project's Python scripts for loading the training data can be imported.

In [3]:
import sys
nb_dir = os.path.split(os.getcwd())[0]
if nb_dir not in sys.path:
    sys.path.append(nb_dir)

The next cell imports the helper scripts, including the training data loader function `load_irmas_data`, which loads the data directly from the *.wav* files.

In [4]:
from py_scripts.directory_funcs import *
from py_scripts.wav_file_funcs import *
from py_scripts.misc_audio_signal_funcs import *
from py_scripts.raw_training_data_creation import load_irmas_data

### Loading Training Data

This section uses the script from the training data creation module to load the training data.

In [5]:
# First define the training data directory, and find the different classes that exist
training_data_dir = '../../data/whole_dataset/training/'
get_subdirectory_names(training_data_dir)

['cel', 'cla', 'flu', 'gac', 'gel', 'org', 'pia', 'sax', 'tru', 'vio', 'voi']

In [6]:
# List every class available in the training data directory
all_classes = get_subdirectory_names(training_data_dir)
# Defining the classes that will be the ones on which we will train for the project
classes_for_project = ['gel', 'pia', 'sax', 'vio', 'voi']
mapping_to_index = dict(zip(classes_for_project, range(len(classes_for_project))))
# Getting the frequency for each class
class_num_files = dict(zip(classes_for_project, [len(get_file_names(construct_path(training_data_dir, class_name), '*.wav')) for class_name in classes_for_project]))
# Label used in the one-vs-all classifier for every class other than the current class
one_vs_all_class_name = 'bad'

#### Loading data for the Primary Class

In [7]:
# Load the data for one class which we choose and then load some from each other class
current_class_name = 'pia'
# Defining the various parameters for loading the input data
rnn_window = (300, 300) # (length of each feature vector, shift between consecutive vectors)
current_class_num_examples = 0
current_class_X, current_class_y = load_irmas_data(training_data_dir, [current_class_name], 
                                                   rnn_window, number_of_training_examples_per_class=current_class_num_examples)

Getting Data from pia
Processing: 721 files
Loaded all the data from the class


In [8]:
# Verifying the shape of the current class data array
current_class_X.shape, current_class_y.shape

((2884, 294, 300), (2884,))
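
The `rnn_window = (300, 300)` setting describes a window length and a shift. The sketch below shows one way such framing could work on a single 1-D signal. `frame_signal` is a hypothetical helper written for illustration, not the implementation inside `load_irmas_data`, and the 88,200-sample input is an assumption chosen only because it produces the 294 x 300 layout seen in the shape above.

```python
import numpy as np

def frame_signal(signal, window_length, shift):
    """Split a 1-D signal into windows of `window_length`, advancing by `shift`.

    Hypothetical helper for illustration only.
    """
    signal = np.asarray(signal)
    if len(signal) < window_length:
        # Not enough samples for even one window
        return np.empty((0, window_length), dtype=signal.dtype)
    num_windows = 1 + (len(signal) - window_length) // shift
    starts = np.arange(num_windows) * shift
    # Index matrix of shape (num_windows, window_length)
    idx = starts[:, None] + np.arange(window_length)[None, :]
    return signal[idx]

# Assumed input length: 294 windows * 300 samples = 88,200 samples
frames = frame_signal(np.arange(88200, dtype=np.float32), 300, 300)
print(frames.shape)
```

With the shift equal to the window length, consecutive windows do not overlap; a smaller shift would produce overlapping windows and more time steps per example.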

#### Loading the data for the residual classes

In [9]:
all_classes_num_examples = class_num_files[current_class_name] // (len(classes_for_project) - 1) \
                           if current_class_num_examples == 0 else current_class_num_examples // (len(classes_for_project) - 1)
all_class_X, all_class_y = load_irmas_data(training_data_dir, set(classes_for_project) - set([current_class_name]), 
                                                   rnn_window, number_of_training_examples_per_class=all_classes_num_examples)

Getting Data from sax
Processing: 180 files
Loaded all the data from the class
Getting Data from voi
Processing: 180 files
Loaded all the data from the class
Getting Data from gel
Processing: 180 files
Loaded all the data from the class
Getting Data from vio
Processing: 180 files
Loaded all the data from the class


In [10]:
# Verifying the shape of the all class data array
all_class_X.shape, all_class_y.shape

((2880, 294, 300), (2880,))
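
The per-class quota computed in the cell above can be checked by hand. The sketch below repeats the same integer arithmetic with the counts shown in the outputs (721 piano files, 5 project classes); `per_class_quota` is a hypothetical name used only for this illustration.

```python
def per_class_quota(primary_count, num_project_classes, requested=0):
    """Number of files to draw from each residual class.

    Mirrors the expression in the notebook cell: use the primary class's
    file count unless an explicit number of examples was requested.
    """
    base = primary_count if requested == 0 else requested
    return base // (num_project_classes - 1)

quota = per_class_quota(721, 5)
residual_files = quota * (5 - 1)
print(quota, residual_files)
```

Integer division leaves the pooled class one file short of the primary class (720 versus 721), which matches the 2880 versus 2884 examples reported above.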

### Creating the Actual Training Data

In [11]:
# First changing the label of the all_class_y labels
all_class_y[:] = one_vs_all_class_name
# Vertically concatenating the two feature data arrays
X = np.vstack((current_class_X, all_class_X))
# Horizontally concatenating the label array
y = np.hstack((current_class_y, all_class_y))

In [12]:
# Free the memory allocated for the all_class_x, current_class_x, all_class_y, current_class_y
current_class_X = None
current_class_y = None
all_class_X = None
all_class_y = None

In [13]:
# Verifying the shape
X.shape, y.shape

((5764, 294, 300), (5764,))
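
The relabel-and-stack step can be tried on toy arrays. The sketch below uses small made-up shapes rather than the real data. It also shows one thing to watch for: NumPy string arrays have a fixed width, so the in-place relabel works here because `'bad'` has the same length as the three-letter class codes, while a longer label would be silently truncated.

```python
import numpy as np

# Toy stand-ins for the real feature and label arrays (shapes are made up)
primary_X = np.zeros((3, 4, 5), dtype=np.float32)
primary_y = np.array(['pia'] * 3)
rest_X = np.ones((2, 4, 5), dtype=np.float32)
rest_y = np.array(['gel', 'sax'])

# Collapse every residual class into the single one-vs-all label
rest_y[:] = 'bad'

# Stack examples along the first axis and join the label vectors
X_demo = np.vstack((primary_X, rest_X))
y_demo = np.hstack((primary_y, rest_y))
print(X_demo.shape, y_demo)

# A longer label does not fit in a 3-character string array
truncated = np.array(['gel', 'sax'])
truncated[:] = 'other'
print(truncated)
```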

In [14]:
# Doing a train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
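
The split above is neither seeded nor stratified, so the class ratio in each partition can vary between runs. A minimal sketch of a reproducible, stratified alternative follows; the toy arrays and the seed value are assumptions made for illustration, while `random_state` and `stratify` are standard `train_test_split` parameters.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 50 examples per class (shapes are made up)
X_toy = np.zeros((100, 3, 2), dtype=np.float32)
y_toy = np.array(['pia'] * 50 + ['bad'] * 50)

# stratify keeps the class ratio equal in both partitions;
# random_state makes the shuffle repeatable
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, stratify=y_toy, random_state=0
)
print(X_tr.shape, X_te.shape)
print((y_te == 'pia').sum(), (y_te == 'bad').sum())
```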

In [15]:
# Freeing up previous memory
X = None
y = None

In [16]:
print(f'Total Number of Training Samples: {y_train.shape}')
print(f'Total number of timestamp values for each sample: {X_train.shape[1]}')
print(f'Total number of features for each sample: {X_train.shape[-1]}')
#print(f'Minimum Feature Value: {np.min(X_train)}, Maximum Feature Value: {np.max(X_train)}')

Total Number of Training Samples: (4611,)
Total number of timestamp values for each sample: 294
Total number of features for each sample: 300


In [17]:
print(f'Total Number of Testing Samples: {y_test.shape}')
print(f'Total number of timestamp values for each sample: {X_test.shape[1]}')
print(f'Total number of features for each sample: {X_test.shape[-1]}')
#print(f'Minimum Feature Value: {np.min(X_train)}, Maximum Feature Value: {np.max(X_train)}')

Total Number of Testing Samples: (1153,)
Total number of timestamp values for each sample: 294
Total number of features for each sample: 300


In [18]:
y_train_categorical = pd.Categorical(y_train)
y_train_numerical = y_train_categorical.codes
y_test_categorical = pd.Categorical(y_test)
y_test_numerical = y_test_categorical.codes
# Checking the categories
print(y_train_categorical.categories, y_test_categorical.categories)

Index(['bad', 'pia'], dtype='object') Index(['bad', 'pia'], dtype='object')


In [19]:
# Find the number of categories
len(y_train_categorical.categories)

2
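
`pd.Categorical` assigns integer codes by sorted category, so here `bad` maps to 0 and `pia` to 1. Because the train and test labels are encoded separately above, the codes agree only when both splits contain the same set of labels. The sketch below, using made-up label lists, shows how passing an explicit `categories` list keeps the mapping fixed.

```python
import pandas as pd

shared_categories = ['bad', 'pia']

# Encoding with a shared category list keeps codes consistent across splits
train_codes = pd.Categorical(['pia', 'bad', 'pia'], categories=shared_categories).codes
test_codes = pd.Categorical(['pia', 'pia'], categories=shared_categories).codes
print(list(train_codes), list(test_codes))

# Without the shared list, a split containing only 'pia' would encode it as 0
inferred_codes = pd.Categorical(['pia', 'pia']).codes
print(list(inferred_codes))
```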

In [20]:
# If training on more than 2 classes, convert the integer codes to one-hot vectors
if len(y_train_categorical.categories) > 2:
    y_train_numerical = to_categorical(y_train_numerical)
    y_test_numerical = to_categorical(y_test_numerical)
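
For the multi-class branch, one-hot encoding turns each integer code into a vector with a single 1. The sketch below is a NumPy stand-in that illustrates the idea; `one_hot` is a hypothetical helper and not Keras's `to_categorical` itself.

```python
import numpy as np

def one_hot(codes, num_classes):
    """Return a (len(codes), num_classes) array with a 1 at each code's index."""
    codes = np.asarray(codes, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype=np.float32)[codes]

encoded = one_hot([0, 2, 1], 3)
print(encoded)
```

In the binary case run in this notebook, the branch is skipped and the labels stay as a single column of 0/1 codes, which is what a one-unit sigmoid output expects.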

## Check For Memory Usage

In [21]:
import sys

# These are the usual ipython objects, including this one you are creating
ipython_vars = ['In', 'Out', 'exit', 'quit', 'get_ipython', 'ipython_vars']

# Get a sorted list of the objects and their sizes
sorted([(x, sys.getsizeof(globals().get(x))) for x in dir() if not x.startswith('_') and x not in sys.modules and x not in ipython_vars], key=lambda x: x[1], reverse=True)

[('X_train', 1626760928),
 ('X_test', 406778528),
 ('y_train', 55428),
 ('y_test', 13932),
 ('y_train_categorical', 4835),
 ('y_test_categorical', 1377),
 ('Dense', 1056),
 ('EarlyStopping', 1056),
 ('Embedding', 1056),
 ('ModelCheckpoint', 1056),
 ('Sequential', 1056),
 ('SimpleRNN', 1056),
 ('TensorBoard', 1056),
 ('nb_dir', 285),
 ('class_num_files', 240),
 ('mapping_to_index', 240),
 ('all_classes', 160),
 ('training_data_dir', 153),
 ('Input', 136),
 ('check_output_directory', 136),
 ('construct_path', 136),
 ('exist_directory', 136),
 ('exist_file', 136),
 ('get_directory_contents', 136),
 ('get_file_names', 136),
 ('get_left_channel_data', 136),
 ('get_right_channel_data', 136),
 ('get_sound_signals', 136),
 ('get_subdirectory_names', 136),
 ('load_irmas_data', 136),
 ('normalize_sound_signals', 136),
 ('read_wav_file', 136),
 ('shift_sound_signals', 136),
 ('to_categorical', 136),
 ('train_test_split', 136),
 ('classes_for_project', 104),
 ('y_test_numerical', 96),
 ('y_train_n
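
The two largest entries can be cross-checked against the array shapes reported earlier. The sketch below assumes the feature arrays hold `float32` values; that is an inference from the reported sizes, not something the notebook states. `array_payload_bytes` is a hypothetical helper.

```python
import numpy as np

def array_payload_bytes(shape, dtype=np.float32):
    """Bytes needed for the raw data of an array with this shape and dtype."""
    count = 1
    for dim in shape:
        count *= dim
    return count * np.dtype(dtype).itemsize

train_bytes = array_payload_bytes((4611, 294, 300))
test_bytes = array_payload_bytes((1153, 294, 300))
print(train_bytes, test_bytes)
```

Under that assumption the payloads come to 1,626,760,800 and 406,778,400 bytes, each 128 bytes less than the figure `sys.getsizeof` reports, a gap that is consistent with per-object overhead. Together the two arrays occupy roughly 2 GB.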

## Model Definition

This section defines the model architecture used for training.

In [22]:
# Defining the input dimensions for the SimpleRNN layer (no Embedding layer is used)
number_of_features = X_train.shape[-1]
number_of_time_stamps = X_train.shape[1]
print(f'Number of Features (Feature Vector Length): {number_of_features}, Number of Time Stamps: {number_of_time_stamps}')

Number of Features (Feature Vector Length): 300, Number of Time Stamps: 294


In [23]:
# Define the model
rnn_layer_num_units = 50
num_classes_for_training = len(y_train_categorical.categories)
model = Sequential()
model.add(SimpleRNN(rnn_layer_num_units, input_shape=(number_of_time_stamps, number_of_features), dropout=0.2))
model.add(Dense(1 if num_classes_for_training < 3 else num_classes_for_training, activation='sigmoid' if num_classes_for_training < 3 else 'softmax'))
# Compiling the model
model.compile(loss='binary_crossentropy' if num_classes_for_training < 3 else 'categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
simple_rnn_1 (SimpleRNN)     (None, 50)                17550     
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 51        
Total params: 17,601
Trainable params: 17,601
Non-trainable params: 0
_________________________________________________________________
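
The parameter counts in the summary can be reproduced by hand. A simple recurrent layer has an input weight matrix, a recurrent weight matrix, and a bias vector; a dense layer has a weight matrix and a bias vector. The helper names below are hypothetical and exist only for this check.

```python
def simple_rnn_params(input_dim, units):
    """Input weights (input_dim * units) + recurrent weights (units * units) + biases (units)."""
    return units * (input_dim + units + 1)

def dense_params(input_dim, units):
    """Weights (input_dim * units) + biases (units)."""
    return units * (input_dim + 1)

rnn_count = simple_rnn_params(input_dim=300, units=50)
dense_count = dense_params(input_dim=50, units=1)
print(rnn_count, dense_count, rnn_count + dense_count)
```

Note that the count does not depend on the 294 time steps, because the same weights are reused at every step.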


In [24]:
# Adding a checkpoint
parent_weight_save_dir = '../../data/Training Results/Vanilla Simple RNN/Weights'
tensor_board_dir_path = '../../data/Training Results/Vanilla Simple RNN/TensorBoard'
check_output_directory(parent_weight_save_dir)
current_experiment_name = f'OneClass-{current_class_name}_InputVectorLen-{number_of_features}_TimeStamps-{number_of_time_stamps}_CT-{time.time()}'
weight_file_path = os.path.join(parent_weight_save_dir, f'{current_experiment_name}.hdf5')
tensor_board_file_path = os.path.join(tensor_board_dir_path, current_experiment_name)
check_output_directory(tensor_board_file_path)
checkpoint = ModelCheckpoint(weight_file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
tensorboard = TensorBoard(log_dir=tensor_board_file_path)
early_stopping_criteria = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=0, mode='auto')
callbacks_list = [tensorboard, checkpoint, early_stopping_criteria]
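
The early stopping callback above watches `val_loss` with `patience=5` and `min_delta=0`. The sketch below is a simplified pure-Python illustration of that patience rule applied to a made-up loss sequence; it is not Keras's implementation, and `early_stopping_epoch` is a hypothetical helper.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once `patience` consecutive epochs pass without the
    validation loss improving on the best value by more than `min_delta`.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up losses: best value at epoch 2, then five epochs without improvement
stop_epoch = early_stopping_epoch([1.0, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99, 0.5])
print(stop_epoch)
```

In the cell above, the checkpoint tracks `val_acc` while early stopping tracks `val_loss`, so the saved weights are the best by validation accuracy, which need not be the epoch with the lowest validation loss.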

In [25]:
history = model.fit(X_train, y_train_numerical, epochs=50, batch_size=64, validation_split=0.2, callbacks=callbacks_list)

Train on 3688 samples, validate on 923 samples
Epoch 1/50

Epoch 00001: val_acc improved from -inf to 0.58397, saving model to ../../data/Training Results/Vanilla Simple RNN/Weights\OneClass-pia_InputVectorLen-300_TimeStamps-294_CT-1544147961.3491752.hdf5
Epoch 2/50

Epoch 00002: val_acc improved from 0.58397 to 0.58830, saving model to ../../data/Training Results/Vanilla Simple RNN/Weights\OneClass-pia_InputVectorLen-300_TimeStamps-294_CT-1544147961.3491752.hdf5
Epoch 3/50

Epoch 00003: val_acc improved from 0.58830 to 0.59697, saving model to ../../data/Training Results/Vanilla Simple RNN/Weights\OneClass-pia_InputVectorLen-300_TimeStamps-294_CT-1544147961.3491752.hdf5
Epoch 4/50

Epoch 00004: val_acc improved from 0.59697 to 0.60130, saving model to ../../data/Training Results/Vanilla Simple RNN/Weights\OneClass-pia_InputVectorLen-300_TimeStamps-294_CT-1544147961.3491752.hdf5
Epoch 5/50

Epoch 00005: val_acc improved from 0.60130 to 0.60563, saving model to ../../data/Training Resul
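
The "Train on 3688 samples, validate on 923 samples" line follows from `validation_split=0.2` applied to the 4611 training examples. Keras takes the validation portion from the end of the provided arrays before any shuffling. The arithmetic below is a sketch that reproduces the counts in the log; `validation_split_sizes` is a hypothetical helper.

```python
def validation_split_sizes(num_samples, validation_split):
    """Sizes of the training and validation portions for a fractional split."""
    split_at = int(num_samples * (1.0 - validation_split))
    return split_at, num_samples - split_at

train_size, val_size = validation_split_sizes(4611, 0.2)
print(train_size, val_size)
```

Because the held-out portion is a contiguous tail, its class balance depends on how the arrays were ordered; here `train_test_split` has already shuffled them.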

In [99]:
# Final evaluation of the model
scores = model.evaluate(X_test, y_test_numerical, verbose=1)
print("Accuracy: %.2f%%" % (scores[1]*100))

Accuracy: 58.72%
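
An accuracy figure is easier to interpret next to a majority-class baseline and a confusion matrix. The sketch below computes both from true labels and predicted probabilities. The toy inputs are made up, `binary_report` is a hypothetical helper, and the 0.5 threshold is an assumption.

```python
import numpy as np

def binary_report(y_true, y_prob, threshold=0.5):
    """Accuracy, majority-class baseline, and confusion counts for binary labels."""
    y_true = np.asarray(y_true).astype(int)
    # Threshold the predicted probabilities to get hard 0/1 predictions
    y_pred = (np.asarray(y_prob).ravel() >= threshold).astype(int)
    positive_rate = float(np.mean(y_true))
    return {
        'accuracy': float(np.mean(y_pred == y_true)),
        'majority_baseline': max(positive_rate, 1.0 - positive_rate),
        'tp': int(np.sum((y_pred == 1) & (y_true == 1))),
        'tn': int(np.sum((y_pred == 0) & (y_true == 0))),
        'fp': int(np.sum((y_pred == 1) & (y_true == 0))),
        'fn': int(np.sum((y_pred == 0) & (y_true == 1))),
    }

demo = binary_report([1, 0, 1, 0, 0], [0.9, 0.4, 0.3, 0.6, 0.1])
print(demo)
```

With the trained model, the probabilities would come from `model.predict(X_test)` and the labels from `y_test_numerical`. Since the two classes here are nearly balanced, the baseline sits close to 50%, which puts the 58.72% test accuracy in context.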
