# Audio Classification
This notebook builds a simple DNN to classify audio clips into different classes.

## Get features
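In this notebook, I will not use the full feature output, because it is a 2D array (coefficients by time frames), much like an image, and feeding all of it to the DNN would give too many inputs. Instead, I average each coefficient over all time frames, which yields one fixed-length vector per audio file.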

In [None]:
import numpy as np
import librosa
def extract_features(file_name):
    try:
        audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast') 
        mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=120)
        mfccsscaled = np.mean(mfccs.T,axis=0)
    except Exception as e:
        print("Error encountered while parsing file: ", file_name)
        return None 
    return mfccsscaled
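
To make the averaging step concrete, the sketch below uses a small random array as a stand-in for the MFCC matrix, which has shape `(n_mfcc, n_frames)`. The name `fake_mfccs` and the frame count are illustrative assumptions, not values from the dataset.

```python
import numpy as np

# Stand-in for the librosa.feature.mfcc output: 120 coefficients by 216 frames.
# The frame count is an arbitrary assumption for illustration.
rng = np.random.default_rng(0)
fake_mfccs = rng.normal(size=(120, 216))

# Same reduction as in extract_features: transpose to (frames, coefficients),
# then average over the frame axis to get one value per coefficient.
feature_vector = np.mean(fake_mfccs.T, axis=0)

# Equivalent form without the transpose.
assert np.allclose(feature_vector, fake_mfccs.mean(axis=1))

print(feature_vector.shape)  # (120,)
```

Whatever the clip length, the result has 120 entries, which is what lets a `Dense` layer with a fixed input size consume it.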

## Read dataset and get all features of all audios
1. Read the CSV file.

Then loop over the rows and, for each row:

2. Extract and save the features of the row's audio file.

3. Save the label.

See the following code:

In [14]:
import pandas as pd
import os
import librosa

# Set the path to the full audio dataset 
fulldatasetpath = './ESC-50/audio/'

metadata = pd.read_csv('./ESC-50/meta/esc50.csv')

features = []

# Iterate through each sound file and extract the features 
for index, row in metadata.iterrows():
    file_name = os.path.join(os.path.abspath(fulldatasetpath),str(row["filename"]))
    class_label = row["category"]
    data = extract_features(file_name)
    features.append([data, class_label])

# Convert into a pandas DataFrame
featuresdf = pd.DataFrame(features, columns=['feature','class_label'])

print('Finished feature extraction from ', len(featuresdf), ' files')

Finished feature extraction from  2000  files
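
`extract_features` returns `None` when a file cannot be parsed, and such a row would cause trouble in the later conversion to a NumPy array. All 2000 files were processed in this run, so the sketch below is only one possible safeguard, shown with made-up entries rather than real extraction results.

```python
# Hypothetical extraction results: [feature_vector, class_label] pairs,
# where a failed file yields None instead of a vector.
features = [
    [[0.1, 0.2, 0.3], "dog"],
    [None, "rain"],  # simulated parse failure
    [[0.4, 0.5, 0.6], "rooster"],
]

# Keep only rows whose feature vector was extracted successfully.
valid = [[data, label] for data, label in features if data is not None]

print(len(features) - len(valid), "file(s) skipped")
```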


## Convert the data and labels
I will use sklearn.preprocessing.LabelEncoder to encode the categorical text labels as integers, and then keras.utils.to_categorical to turn those integers into one-hot vectors that the model can be trained on.

In [15]:
from sklearn.preprocessing import LabelEncoder
from keras.utils import to_categorical

# Convert features and corresponding classification labels into numpy arrays
X = np.array(featuresdf.feature.tolist())
y = np.array(featuresdf.class_label.tolist())

# Encode the classification labels
le = LabelEncoder()
yy = to_categorical(le.fit_transform(y))
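
The two encoding steps can be illustrated with plain NumPy. This is a stand-in for `LabelEncoder` followed by `to_categorical`, not their actual implementation, and the label values are hypothetical.

```python
import numpy as np

# Hypothetical labels; the real ones come from the "category" column.
y = np.array(["rain", "dog", "rain", "rooster"])

# Step 1: map each class name to an integer index. np.unique sorts the
# class names, which matches the ordering LabelEncoder uses.
classes, y_int = np.unique(y, return_inverse=True)

# Step 2: one-hot encode the indices, one column per class.
yy = np.eye(len(classes))[y_int]

print(classes.tolist())  # ['dog', 'rain', 'rooster']
print(y_int.tolist())    # [1, 0, 1, 2]
print(yy.shape)          # (4, 3)
```

The number of columns in `yy` equals the number of classes, which is why the notebook later reads `num_labels` from `yy.shape[1]`.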

## Split the dataset
Here I will use sklearn.model_selection.train_test_split to split the dataset into training and testing sets. The testing set is 20% of the data, and I set a random state so that the split is reproducible.

In [16]:
# split the dataset 
from sklearn.model_selection import train_test_split 

x_train, x_test, y_train, y_test = train_test_split(X, yy, test_size=0.2, random_state = 42)
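
As a rough illustration of what the split does, the sketch below shuffles row indices with NumPy and holds out 20% of them. It is a simplified stand-in, not scikit-learn's implementation, so the rows it selects will differ from those chosen by `train_test_split` even with the same seed. The sample count and feature length come from this notebook (2000 files, 120 coefficients); the zero-filled array is a placeholder.

```python
import numpy as np

n_samples, n_features = 2000, 120
X = np.zeros((n_samples, n_features))  # placeholder features

# Shuffle row indices reproducibly, then hold out the first 20% as the test set.
rng = np.random.default_rng(42)
indices = rng.permutation(n_samples)
n_test = int(round(0.2 * n_samples))

test_idx, train_idx = indices[:n_test], indices[n_test:]
x_train, x_test = X[train_idx], X[test_idx]

print(x_train.shape, x_test.shape)  # (1600, 120) (400, 120)
```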

## Build DNN model
See the following code:

In [17]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

# Number of output classes (one column per class in the one-hot labels)
num_labels = yy.shape[1]

# Construct model 
model = Sequential()

model.add(Dense(256, input_shape=(120,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(num_labels))
model.add(Activation('softmax'))

In [18]:
# Compile the model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

In [19]:
# Display model architecture summary 
model.summary()

# Calculate pre-training accuracy 
score = model.evaluate(x_test, y_test, verbose=0)
accuracy = 100 * score[1]

print("Pre-training accuracy: %.4f%%" % accuracy)

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 256)               30976     
_________________________________________________________________
activation_4 (Activation)    (None, 256)               0         
_________________________________________________________________
dropout_3 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 256)               65792     
_________________________________________________________________
activation_5 (Activation)    (None, 256)               0         
_________________________________________________________________
dropout_4 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 50)               
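
The `Param #` column can be checked by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The helper name `dense_params` is made up for this sketch. The last row of the summary above is cut off, so the value computed for the output layer comes from the formula, not from the notebook output.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Input size followed by the width of each Dense layer in the model.
layer_sizes = [120, 256, 256, 50]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [30976, 65792, 12850]
print(sum(counts))  # 109618
```

The first two values match the `dense_4` and `dense_5` rows of the summary. The `Activation` and `Dropout` layers have no trainable parameters, which is why they show 0.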

## Train and save model

In [20]:
from keras.callbacks import ModelCheckpoint 
from datetime import datetime 

num_epochs = 2000
num_batch_size = 32

# Save a checkpoint of the best model so far to this path
checkpointer = ModelCheckpoint(filepath='save_models/weights.best.basic_mlp.hdf5', 
                               verbose=1, save_best_only=True)
start = datetime.now()

print(x_train.shape)

model.fit(x_train, y_train, batch_size=num_batch_size, epochs=num_epochs, validation_data=(x_test, y_test), callbacks=[checkpointer], verbose=1)


duration = datetime.now() - start
print("Training completed in time: ", duration)


Epoch 01902: val_loss did not improve from 2.27184
Epoch 1903/2000

Epoch 01903: val_loss did not improve from 2.27184
Epoch 1904/2000

Epoch 01904: val_loss did not improve from 2.27184
Epoch 1905/2000

Epoch 01905: val_loss did not improve from 2.27184
Epoch 1906/2000

Epoch 01906: val_loss did not improve from 2.27184
Epoch 1907/2000

Epoch 01907: val_loss did not improve from 2.27184
Epoch 1908/2000

Epoch 01908: val_loss did not improve from 2.27184
Epoch 1909/2000

Epoch 01909: val_loss did not improve from 2.27184
Epoch 1910/2000

Epoch 01910: val_loss did not improve from 2.27184
Epoch 1911/2000

Epoch 01911: val_loss did not improve from 2.27184
Epoch 1912/2000

Epoch 01912: val_loss did not improve from 2.27184
Epoch 1913/2000

Epoch 01913: val_loss did not improve from 2.27184
Epoch 1914/2000

Epoch 01914: val_loss did not improve from 2.27184
Epoch 1915/2000

Epoch 01915: val_loss did not improve from 2.27184
Epoch 1916/2000

Epoch 01916: val_loss did not improve from 2.27
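
The repeated `val_loss did not improve` messages come from `ModelCheckpoint` with `save_best_only=True`: the checkpoint file is overwritten only when the monitored value (`val_loss` by default) beats the best value seen so far. The loop below is a simplified sketch of that rule with invented loss values; it is not Keras code.

```python
# Invented validation losses, one per epoch, for illustration only.
val_losses = [3.10, 2.60, 2.27, 2.31, 2.45, 2.29]

best = float("inf")
saved_epochs = []  # epochs at which a checkpoint would be written

for epoch, val_loss in enumerate(val_losses, start=1):
    if val_loss < best:
        best = val_loss
        saved_epochs.append(epoch)
    else:
        print(f"Epoch {epoch:05d}: val_loss did not improve from {best:.5f}")

print(saved_epochs)  # [1, 2, 3]
```

In the log above, the best value stays at 2.27184 for every epoch shown, so the saved weights date from an earlier epoch than the ones listed.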

## Evaluate on training data and testing data

In [21]:
# Evaluating the model on the training and testing set
score = model.evaluate(x_train, y_train, verbose=0)
print("Training Accuracy: ", score[1])

score = model.evaluate(x_test, y_test, verbose=0)
print("Testing Accuracy: ", score[1])

Training Accuracy:  0.9818750023841858
Testing Accuracy:  0.47999998927116394
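
To put these two numbers in context, the sketch below computes the gap between training and testing accuracy, along with the accuracy a uniform random guess would reach over 50 classes. It assumes the classes are equally frequent, which this notebook does not check.

```python
train_acc = 0.9818750023841858   # from the output above
test_acc = 0.47999998927116394   # from the output above
num_classes = 50                 # output width of the last Dense layer

gap = train_acc - test_acc
chance = 1 / num_classes  # assumes equally frequent classes

print(f"train-test gap: {gap:.3f}")  # train-test gap: 0.502
print(f"chance level: {chance:.3f}")  # chance level: 0.020
```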


## Observations
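The performance of this model is not satisfactory: training accuracy is about 98%, but testing accuracy is only 48%, so the model overfits the training set. A possible next step is to use all of the features (the full 2D array instead of the time-averaged vector) and train a CNN on them.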