# Deep Learning for Audio Classification in Python

## Data Preprocessing, RNN Model Building, and Model Saving

```
What's the execution plan?
- The data is in the directory 'Dataset'
  - which is split into the directories 'Train', 'Test' and 'Validation'
- Each set has two directories, named after the dataset classes

What's the dataset size?
- It's big!!!

Is it a Big Data problem?
- Yes

Do I have the resources to use Hadoop/AWS?
- No, I'm in lockdown, and limited time and knowledge are a concern for me!!

What's the solution?
- I have to use my old Intel Core i3 laptop :/ to build a basic template
- Once I get internet access, I'll use the template to run on Google Colab =')
- After debugging, I'll scale up to the full dataset and re-run the program files for visualization and model training :O
```

### Importing Libraries

In [1]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

In [2]:
from python_speech_features import mfcc

In [3]:
from tqdm import tqdm

In [5]:
from keras.layers import Conv2D, MaxPool2D, Flatten, Dropout, Dense
from keras.layers import LSTM, TimeDistributed

from keras.models import Sequential

from keras.utils import to_categorical

from sklearn.utils.class_weight import compute_class_weight

In [6]:
import pickle

from keras.callbacks import ModelCheckpoint

### User-Defined Class and Functions

In [7]:
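class Config:
    """Feature-extraction settings and output paths shared by the notebook."""
    def __init__(self, mode='conv', nfilt=26, nfeat=13, nfft=2048, rate=16000):
        self.mode = mode      # 'conv' or 'time'; selects the reshape in build_rand_feat
        self.nfilt = nfilt    # number of mel filters passed to mfcc
        self.nfeat = nfeat    # number of cepstral coefficients kept (numcep)
        self.nfft = nfft      # FFT size
        self.rate = rate      # nominal sampling rate in Hz
        self.step = int(rate / 10)  # window length in samples (0.1 s at `rate`)
        self.model_path = os.path.join('models', mode + '.model')
        self.p_path = os.path.join('pickles', mode + '.p')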

In [8]:
def check_data():
    if os.path.isfile(config.p_path):
        print('Loading existing data for {} model'.format(config.mode))
        with open(config.p_path, 'rb') as handle:
            tmp = pickle.load(handle)
            return tmp
    else:
        return None
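
The caching idea behind `check_data` (load a pickle if it exists, otherwise rebuild the features) can be sketched in isolation. This is a minimal sketch rather than the notebook's code: `load_or_build`, `cache_path` and `build_fn` are hypothetical names, and the path is passed as an argument instead of being read from the global `config`.

```python
import os
import pickle
import tempfile


def load_or_build(cache_path, build_fn):
    """Return cached data if cache_path exists, else build, cache and return it."""
    if os.path.isfile(cache_path):
        with open(cache_path, 'rb') as handle:
            return pickle.load(handle)
    data = build_fn()
    # Make sure the target directory exists before writing the cache.
    cache_dir = os.path.dirname(cache_path)
    if cache_dir:
        os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, 'wb') as handle:
        pickle.dump(data, handle)
    return data


# Usage: the second call reads the pickle instead of calling the builder again.
calls = []


def expensive_build():
    calls.append(1)
    return {'X': [[0.1, 0.2]], 'y': [1]}


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'pickles', 'time.p')
    first = load_or_build(path, expensive_build)
    second = load_or_build(path, expensive_build)
```

Creating the directory on write is an addition of this sketch; the notebook's functions assume that `pickles` and `models` already exist.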

In [9]:
def build_rand_feat():
    tmp = check_data()
    if tmp is not None:
        # Reuse the cached (X, y) pair stored on the pickled Config object
        return tmp.data[0], tmp.data[1]
        
    X = []
    y = []
    
    _min, _max = float('inf'), -float('inf')
    
    for _ in tqdm(range(n_samples)):
        
        rand_class = np.random.choice(class_dist.index, p = prob_dist)
        
        file = np.random.choice(df[df.Class==rand_class].index)
        
        rate, wav = wavfile.read(dataset_directory+str(rand_class)+"/"+str(file))
        Class = df.at[file, 'Class']
        
        rand_index = np.random.randint(0, wav.shape[0]-config.step)
        
        sample = wav[rand_index : rand_index + config.step]
        X_sample = mfcc(sample, rate, numcep=config.nfeat, nfilt=config.nfilt, nfft=config.nfft)
        
        _min = min(np.amin(X_sample), _min)
        _max = max(np.amax(X_sample), _max)
        
        X.append(X_sample)
        y.append(classes.index(Class))
        
    
    config.min = _min
    config.max = _max
    
    X, y = np.array(X), np.array(y)
    X = (X- _min) / (_max - _min)
    
    if config.mode == 'conv':
        X = X.reshape(X.shape[0], X.shape[1], X.shape[2], 1)
    elif config.mode =='time':
        X = X.reshape(X.shape[0], X.shape[1], X.shape[2])
    
    y = to_categorical(y, num_classes=2)
    
    config.data = (X, y)
    
    with open(config.p_path, 'wb') as handle:
        pickle.dump(config, handle, protocol=2)
    
    return X,y
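
`build_rand_feat` scales all features with one global minimum and maximum and stores both on `config`, presumably so the same scaling can be applied to new audio later. The sketch below illustrates that idea in plain Python on made-up numbers; `fit_bounds` and `min_max_scale` are hypothetical helper names, not part of the notebook.

```python
def fit_bounds(samples):
    """Return the global (min, max) over a list of 2-D feature matrices."""
    flat = [v for sample in samples for row in sample for v in row]
    return min(flat), max(flat)


def min_max_scale(sample, lo, hi):
    """Scale one 2-D feature matrix with previously stored bounds."""
    span = hi - lo
    return [[(v - lo) / span for v in row] for row in sample]


# Two toy "MFCC" matrices (2 frames x 2 coefficients each); values are made up.
train = [[[-4.0, 0.0], [2.0, 6.0]],
         [[1.0, 3.0], [-2.0, 4.0]]]
lo, hi = fit_bounds(train)
scaled_train = [min_max_scale(s, lo, hi) for s in train]

# A new sample is scaled with the stored training bounds, so values outside
# the training range fall outside [0, 1].
new_sample = [[8.0, -4.0]]
scaled_new = min_max_scale(new_sample, lo, hi)
```

Reusing the stored bounds keeps training and prediction inputs on the same scale, at the cost of occasionally producing values slightly outside the unit interval.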

In [10]:
def get_reccurent_model():
    ### Shape of data for RNN is (n, time, freq)
    model = Sequential()
    
    model.add(LSTM(128, return_sequences=True, input_shape=input_shape))
    model.add(LSTM(128, return_sequences=True))
    
    model.add(TimeDistributed(Dense(64, activation='relu')))
    model.add(TimeDistributed(Dense(32, activation='relu')))
    model.add(TimeDistributed(Dense(16, activation='relu')))
    
    model.add(Flatten())
    model.add(Dropout(0.5))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(8, activation='relu'))
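    # Two sigmoid units with binary cross-entropy score each one-hot column
    # independently; a softmax output with categorical_crossentropy is the more
    # common choice for mutually exclusive classes.
    model.add(Dense(2, activation='sigmoid'))
    model.summary()
    model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics=['acc'])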
    
    return model

## Data Extraction

In [11]:
os.listdir('Temp_Dataset/')

['test', 'train', 'validation']

In [12]:
classes = list(os.listdir('Dataset/train/'))

print("Number of Classes in the Data Set:", len(classes), "Classes")
print("The classes of the dataset are   :", classes[0], ",", classes[1])

Number of Classes in the Data Set: 2 Classes
The classes of the dataset are   : not_sick , sick


#### Creating the dataframe with basic column names

In [13]:
column_names = ['Fname','Class', 'Length']
df = pd.DataFrame(columns = column_names)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 0 entries
Data columns (total 3 columns):
Fname     0 non-null object
Class     0 non-null object
Length    0 non-null object
dtypes: object(3)
memory usage: 0.0+ bytes


In [14]:
# dataset_directory = 'Dataset/Train/'
dataset_directory = 'Temp_Dataset/train/'

In [15]:
for c in list(classes):
    print('Number of files in the directory \'{}\' are {}'.format(c,len(os.listdir(dataset_directory+c))))

Number of files in the directory 'not_sick' are 385
Number of files in the directory 'sick' are 313


In [16]:
for c in list(classes):
    for n,f in tqdm(enumerate(os.listdir(dataset_directory+c))):
        rate, signal = wavfile.read(dataset_directory+str(c)+"/"+str(f))
        length = signal.shape[0]/rate
        f_df = pd.DataFrame({
            "Fname": str(f),
            "Class": str(c),
            "Length": length}, index = [n])
        # Note: DataFrame.append was removed in pandas 2.0; use pd.concat([df, f_df]) there.
        df = df.append(f_df)

385it [00:10, 35.03it/s]
313it [00:09, 34.20it/s]
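
The loop above computes each clip's length as the number of samples divided by the sampling rate. A minimal, dependency-free sketch of the same calculation follows; it assumes the standard-library `wave` module in place of `scipy.io.wavfile`, writes a short silent file just for the demonstration, and uses the hypothetical helper name `wav_duration_seconds`.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path):
    """Duration of a PCM WAV file: number of frames divided by the frame rate."""
    with wave.open(path, 'rb') as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'silence.wav')
    rate, n_frames = 16000, 8000  # half a second of silence
    with wave.open(path, 'wb') as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 16-bit samples
        wav_file.setframerate(rate)
        wav_file.writeframes(b'\x00\x00' * n_frames)
    # One dict per file; a list of such rows can be passed to pd.DataFrame at once.
    rows = [{'Fname': os.path.basename(path), 'Class': 'not_sick',
             'Length': wav_duration_seconds(path)}]
```

Collecting plain dicts and building the DataFrame once at the end avoids growing a DataFrame row by row inside the loop.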


In [17]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 698 entries, 0 to 312
Data columns (total 3 columns):
Fname     698 non-null object
Class     698 non-null object
Length    698 non-null float64
dtypes: float64(1), object(2)
memory usage: 21.8+ KB


In [18]:
class_dist = df.groupby(['Class'])['Length'].mean()
class_dist

Class
not_sick    4.999960
sick        5.000198
Name: Length, dtype: float64

In [19]:
df.set_index('Fname', inplace=True)
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 698 entries, 02Sa2hBL_48_25_30.wav to _zrAnhgYzSo_15_20.wav
Data columns (total 2 columns):
Class     698 non-null object
Length    698 non-null float64
dtypes: float64(1), object(1)
memory usage: 16.4+ KB


## RNN Model using LSTM

In [20]:
n_samples = 2 * int(df['Length'].sum()/0.1)
prob_dist = class_dist / class_dist.sum()
choices = np.random.choice(class_dist.index, p= prob_dist)
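
The arithmetic behind `n_samples` and `prob_dist` can be checked by hand from the outputs printed above. The sketch below reconstructs the total duration from the file counts and the rounded mean lengths, so it is an approximation of `df['Length'].sum()`; the dictionary names are made up for illustration.

```python
# Values copied from the notebook output above.
file_counts = {'not_sick': 385, 'sick': 313}
mean_length = {'not_sick': 4.999960, 'sick': 5.000198}

# Total audio duration in seconds, reconstructed from counts and mean lengths.
total_seconds = sum(file_counts[c] * mean_length[c] for c in file_counts)

# Two draws per 0.1 s of audio, as in n_samples above.
n_samples = 2 * int(total_seconds / 0.1)

# Sampling probabilities are normalised mean lengths, not file counts.
total_mean = sum(mean_length.values())
prob_dist = {c: mean_length[c] / total_mean for c in mean_length}

# For comparison: probabilities proportional to the number of files.
total_files = sum(file_counts.values())
count_dist = {c: file_counts[c] / total_files for c in file_counts}
```

Because `class_dist` holds the mean clip length per class and every clip is about 5 s long, the classes are drawn with almost equal probability even though `not_sick` has more files (roughly 55% of them).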

In [21]:
config = Config(mode = 'time')
config

<__main__.Config at 0x988b768a58>

In [22]:
X,y = build_rand_feat()

100%|███████████████████████████████████| 69800/69800 [08:05<00:00, 143.68it/s]


In [23]:
y_flat = np.argmax(y, axis =1)

In [24]:
input_shape = (X.shape[1], X.shape[2])
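
The first entry of `input_shape` is the number of MFCC frames per window. The sketch below estimates that number, assuming the default `winlen=0.025` and `winstep=0.01` of `python_speech_features.mfcc` (the notebook does not override them) and a framing rule of one frame plus one per additional step. `num_frames` is a hypothetical helper written for this illustration.

```python
import math


def num_frames(n_samples, sample_rate, winlen=0.025, winstep=0.01):
    """Frames produced for n_samples at sample_rate with the given window settings.

    Uses the rule 1 + ceil((n - frame_len) / frame_step), with frame_len and
    frame_step rounded half up to whole samples.
    """
    frame_len = int(math.floor(winlen * sample_rate + 0.5))
    frame_step = int(math.floor(winstep * sample_rate + 0.5))
    if n_samples <= frame_len:
        return 1
    return 1 + int(math.ceil((n_samples - frame_len) / frame_step))


step = 1600  # Config.step for rate=16000
frames_by_rate = {rate: num_frames(step, rate) for rate in (16000, 44100, 48000)}
```

Under these assumptions a 1600-sample window gives 9 frames at 16 kHz but 3 frames at 44.1 kHz. The model summary in this notebook shows 3 time steps, which would be consistent with the files being sampled at 44.1 kHz (the `rate` passed to `mfcc` is read from each file, not from `config.rate`). That is an inference from the shapes, not something the notebook states.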

In [25]:
model = get_reccurent_model()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 3, 128)            72704     
_________________________________________________________________
lstm_2 (LSTM)                (None, 3, 128)            131584    
_________________________________________________________________
time_distributed_1 (TimeDist (None, 3, 64)             8256      
_________________________________________________________________
time_distributed_2 (TimeDist (None, 3, 32)             2080      
_________________________________________________________________
time_distributed_3 (TimeDist (None, 3, 16)             528       
_________________________________________________________________
flatten_1 (Flatten)          (None, 48)                0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 48)                0         
__________
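
The parameter counts in the summary above can be reproduced by hand. The sketch below uses the usual formulas for an LSTM layer with biases (four gates, each with input weights, recurrent weights and a bias) and for a dense layer; the helper names are made up for this check.

```python
def lstm_params(units, input_dim):
    """Trainable parameters of a standard LSTM layer: 4 gates, each with
    input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    """Weights plus biases of a fully connected layer."""
    return units * input_dim + units


n_feat = 13  # MFCC coefficients per frame (Config.nfeat)
counts = {
    'lstm_1': lstm_params(128, n_feat),
    'lstm_2': lstm_params(128, 128),
    'time_distributed_1': dense_params(64, 128),
    'time_distributed_2': dense_params(32, 64),
    'time_distributed_3': dense_params(16, 32),
}

# Flatten turns the (3, 16) output of the last TimeDistributed layer into 48 values.
flatten_units = 3 * 16
```

A `TimeDistributed(Dense(...))` layer shares one set of dense weights across all time steps, which is why its count does not depend on the number of frames.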

## Adding Checkpoints

In [26]:
checkpoint = ModelCheckpoint(config.model_path, monitor='val_acc', verbose=1, mode='max',
                            save_best_only=True, save_weights_only=False, period=1)
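
With `save_best_only=True` and `mode='max'`, the model file is overwritten only when the monitored value beats the best one seen so far. The following is a simplified illustration of that bookkeeping, not Keras's implementation; `BestValueTracker` is a hypothetical class.

```python
class BestValueTracker:
    """Remember the best monitored value seen so far (mode='max')."""

    def __init__(self):
        self.best = float('-inf')

    def update(self, value):
        """Return True (and store the value) if it beats the previous best."""
        if value > self.best:
            self.best = value
            return True
        return False


# Values for epochs 1, 3 and 4 are taken from the training log in this section;
# 0.52 stands in for epoch 2, whose actual val_acc the log does not print.
val_acc_per_epoch = [0.53073, 0.52, 0.54069, 0.55401]
tracker = BestValueTracker()
would_save = [tracker.update(v) for v in val_acc_per_epoch]
```

Starting from negative infinity is what makes the first epoch always count as an improvement, matching the "improved from -inf" line in the log.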

In [27]:
model.fit(X, y, epochs=250, batch_size=32, shuffle = True, validation_split=0.1, callbacks=[checkpoint])

Train on 62820 samples, validate on 6980 samples
Epoch 1/250

Epoch 00001: val_acc improved from -inf to 0.53073, saving model to models\time.model
Epoch 2/250

Epoch 00002: val_acc did not improve from 0.53073
Epoch 3/250

Epoch 00003: val_acc improved from 0.53073 to 0.54069, saving model to models\time.model
Epoch 4/250

Epoch 00004: val_acc improved from 0.54069 to 0.55401, saving model to models\time.model
Epoch 5/250

Epoch 00005: val_acc did not improve from 0.55401
Epoch 6/250

Epoch 00006: val_acc improved from 0.55401 to 0.55781, saving model to models\time.model
Epoch 7/250

Epoch 00007: val_acc did not improve from 0.55781
Epoch 8/250

Epoch 00008: val_acc improved from 0.55781 to 0.56884, saving model to models\time.model
Epoch 9/250

Epoch 00009: val_acc improved from 0.56884 to 0.57966, saving model to models\time.model
Epoch 10/250

Epoch 00010: val_acc improved from 0.57966 to 0.58983, saving model to models\time.model
Epoch 11/250

Epoch 00011: val_acc improved from 0 [...]


Epoch 00038: val_acc improved from 0.71855 to 0.72256, saving model to models\time.model
Epoch 39/250

Epoch 00039: val_acc improved from 0.72256 to 0.72822, saving model to models\time.model
Epoch 40/250

Epoch 00040: val_acc improved from 0.72822 to 0.73245, saving model to models\time.model
Epoch 41/250

Epoch 00041: val_acc did not improve from 0.73245
Epoch 42/250

Epoch 00042: val_acc improved from 0.73245 to 0.73861, saving model to models\time.model
Epoch 43/250

Epoch 00043: val_acc did not improve from 0.73861
Epoch 44/250

Epoch 00044: val_acc improved from 0.73861 to 0.74542, saving model to models\time.model
Epoch 45/250

Epoch 00045: val_acc did not improve from 0.74542
Epoch 46/250

Epoch 00046: val_acc improved from 0.74542 to 0.74943, saving model to models\time.model
Epoch 47/250

Epoch 00047: val_acc did not improve from 0.74943
Epoch 48/250

Epoch 00048: val_acc did not improve from 0.74943
Epoch 49/250

Epoch 00049: val_acc did not improve from 0.74943
Epoch 50/25 [...]


Epoch 00079: val_acc did not improve from 0.76590
Epoch 80/250

Epoch 00080: val_acc improved from 0.76590 to 0.77142, saving model to models\time.model
Epoch 81/250

Epoch 00081: val_acc did not improve from 0.77142
Epoch 82/250

Epoch 00082: val_acc did not improve from 0.77142
Epoch 83/250

Epoch 00083: val_acc did not improve from 0.77142
Epoch 84/250

Epoch 00084: val_acc improved from 0.77142 to 0.77371, saving model to models\time.model
Epoch 85/250

Epoch 00085: val_acc did not improve from 0.77371
Epoch 86/250

Epoch 00086: val_acc did not improve from 0.77371
Epoch 87/250

Epoch 00087: val_acc did not improve from 0.77371
Epoch 88/250

Epoch 00088: val_acc did not improve from 0.77371
Epoch 89/250

Epoch 00089: val_acc did not improve from 0.77371
Epoch 90/250

Epoch 00090: val_acc improved from 0.77371 to 0.77521, saving model to models\time.model
Epoch 91/250

Epoch 00091: val_acc did not improve from 0.77521
Epoch 92/250

Epoch 00092: val_acc improved from 0.77521 to 0.78 [...]


Epoch 00120: val_acc improved from 0.78911 to 0.78940, saving model to models\time.model
Epoch 121/250

Epoch 00121: val_acc did not improve from 0.78940
Epoch 122/250

Epoch 00122: val_acc did not improve from 0.78940
Epoch 123/250

Epoch 00123: val_acc did not improve from 0.78940
Epoch 124/250

Epoch 00124: val_acc did not improve from 0.78940
Epoch 125/250

Epoch 00125: val_acc did not improve from 0.78940
Epoch 126/250

Epoch 00126: val_acc did not improve from 0.78940
Epoch 127/250

Epoch 00127: val_acc did not improve from 0.78940
Epoch 128/250

Epoch 00128: val_acc improved from 0.78940 to 0.79298, saving model to models\time.model
Epoch 129/250

Epoch 00129: val_acc did not improve from 0.79298
Epoch 130/250

Epoch 00130: val_acc did not improve from 0.79298
Epoch 131/250

Epoch 00131: val_acc improved from 0.79298 to 0.79628, saving model to models\time.model
Epoch 132/250

Epoch 00132: val_acc did not improve from 0.79628
Epoch 133/250

Epoch 00133: val_acc did not improve [...]


Epoch 00161: val_acc did not improve from 0.79864
Epoch 162/250

Epoch 00162: val_acc did not improve from 0.79864
Epoch 163/250

Epoch 00163: val_acc did not improve from 0.79864
Epoch 164/250

Epoch 00164: val_acc did not improve from 0.79864
Epoch 165/250

Epoch 00165: val_acc did not improve from 0.79864
Epoch 166/250

Epoch 00166: val_acc did not improve from 0.79864
Epoch 167/250

Epoch 00167: val_acc did not improve from 0.79864
Epoch 168/250

Epoch 00168: val_acc did not improve from 0.79864
Epoch 169/250

Epoch 00169: val_acc did not improve from 0.79864
Epoch 170/250

Epoch 00170: val_acc did not improve from 0.79864
Epoch 171/250

Epoch 00171: val_acc did not improve from 0.79864
Epoch 172/250

Epoch 00172: val_acc did not improve from 0.79864
Epoch 173/250

Epoch 00173: val_acc did not improve from 0.79864
Epoch 174/250

Epoch 00174: val_acc did not improve from 0.79864
Epoch 175/250

Epoch 00175: val_acc did not improve from 0.79864
Epoch 176/250

Epoch 00176: val_acc did [...]


Epoch 00204: val_acc did not improve from 0.79864
Epoch 205/250

Epoch 00205: val_acc did not improve from 0.79864
Epoch 206/250

Epoch 00206: val_acc did not improve from 0.79864
Epoch 207/250

Epoch 00207: val_acc did not improve from 0.79864
Epoch 208/250

Epoch 00208: val_acc did not improve from 0.79864
Epoch 209/250

Epoch 00209: val_acc did not improve from 0.79864
Epoch 210/250

Epoch 00210: val_acc did not improve from 0.79864
Epoch 211/250

Epoch 00211: val_acc did not improve from 0.79864
Epoch 212/250

Epoch 00212: val_acc did not improve from 0.79864
Epoch 213/250

Epoch 00213: val_acc did not improve from 0.79864
Epoch 214/250

Epoch 00214: val_acc improved from 0.79864 to 0.80193, saving model to models\time.model
Epoch 215/250

Epoch 00215: val_acc did not improve from 0.80193
Epoch 216/250

Epoch 00216: val_acc improved from 0.80193 to 0.80881, saving model to models\time.model
Epoch 217/250

Epoch 00217: val_acc did not improve from 0.80881
Epoch 218/250

Epoch 00218 [...]


Epoch 00246: val_acc did not improve from 0.80881
Epoch 247/250

Epoch 00247: val_acc did not improve from 0.80881
Epoch 248/250

Epoch 00248: val_acc did not improve from 0.80881
Epoch 249/250

Epoch 00249: val_acc did not improve from 0.80881
Epoch 250/250

Epoch 00250: val_acc did not improve from 0.80881


<keras.callbacks.History at 0x98907dbd68>
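
The "Train on 62820 samples, validate on 6980 samples" line follows from `validation_split=0.1`: Keras holds out the last fraction of the rows in `X` and `y`, taken before any shuffling. A minimal sketch of that split, with the hypothetical helper `split_index`:

```python
def split_index(n_total, validation_split):
    """Index separating training rows from the trailing validation rows."""
    return int(n_total * (1.0 - validation_split))


n_total = 69800  # rows returned by build_rand_feat
split_at = split_index(n_total, 0.1)
n_train, n_val = split_at, n_total - split_at

# The validation rows are the last ones in X and y, taken before shuffling.
indices = list(range(n_total))
train_idx, val_idx = indices[:split_at], indices[split_at:]
```

Since every row is a random window drawn from the same pool of training files, validation windows can come from recordings that also appear in the training rows, so `val_acc` here may be optimistic. Evaluating on the separate validation directory would give a cleaner estimate.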

In [None]:
fig, axes = plt.subplots(nrows=1, ncols=1, sharex=False, sharey=True, figsize=(20,8))

# Plot training and validation accuracy for the first 50 epochs
plt.plot(model.history.history['acc'][:50], label='acc')
plt.plot(model.history.history['val_acc'][:50], label='val_acc')
plt.legend()

plt.title('Custom Built LSTM RNN Model\'s Training Analysis on the sickness and non-sickness Audio Data', size=16)
plt.xlabel("Epochs")
plt.ylabel("accuracy reached")

plt.show()