### Our dataset is now available as a .csv file
### Next, we train a CNN to predict the liveliness of each audio clip

In [2]:
import numpy as np
import pandas as pd
from keras.models import Sequential,load_model
from keras.layers import Dense,Conv2D,MaxPooling2D,BatchNormalization,Flatten,Activation,Dropout
from keras.callbacks import ModelCheckpoint
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder

Using TensorFlow backend.


### here we mount the Google Drive folder that holds the dataset in Google Colaboratory

In [5]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


### then we read the dataset into a dataframe

In [0]:
data = pd.read_csv("gdrive/My Drive/ml_projects/data (1).csv")

### here we load the NumPy files of log-mel spectrograms, one array per class, and convert them into 3 lists

In [0]:
good_logmels = np.load("gdrive/My Drive/ml_projects/good_logmels (1).npy")
bad_logmels = np.load("gdrive/My Drive/ml_projects/bad_logmels (1).npy")
avg_logmels = np.load("gdrive/My Drive/ml_projects/avg_logmels (1).npy")

In [0]:
good_list = list(good_logmels)
bad_list = list(bad_logmels)
avg_list = list(avg_logmels)

### here we concatenated the 3 lists of log-mel-specs

In [0]:
#concatenate the per-class lists into one list of arrays (order: good, bad, avg)
logmels = good_list + bad_list + avg_list

### now we add a new column, 'logmels', to our dataframe to store the log-mel spectrogram of each audio file

In [0]:
data['logmels'] = logmels
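The column assignment above only pairs each spectrogram with the right label if the rows of the .csv file are in the same order as the concatenated arrays (all good clips, then all bad, then all average). The notebook does not show that check, so the following is a minimal sketch of one, using tiny stand-in arrays and assumed label strings rather than the real data.

```python
import numpy as np

def expected_label_order(blocks):
    """Return the label sequence implied by concatenating the blocks in order."""
    return [label for label, arr in blocks for _ in range(len(arr))]

# Toy stand-ins for the real arrays: 2 "good", 1 "bad", 1 "average" clip,
# each a 4x6 array instead of a 128x216 spectrogram.
good = np.zeros((2, 4, 6))
bad = np.ones((1, 4, 6))
avg = np.full((1, 4, 6), 0.5)
blocks = [("good", good), ("bad", bad), ("average", avg)]

# Hypothetical stand-in for list(data['class']); the label strings are assumptions.
csv_labels = ["good", "good", "bad", "average"]

# True only if the CSV rows line up with the concatenated spectrograms.
aligned = expected_label_order(blocks) == csv_labels
print(aligned)
```

With the real data, `csv_labels` would come from the 'class' column and the blocks from the three loaded arrays.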

### here we shuffle the dataset

In [0]:
#shuffle data
data = data.sample(frac = 1)
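Because `sample(frac = 1)` draws a new permutation on every run, the train/test split below changes each time the notebook is executed. One way to make it repeatable is to pass `random_state`. The seed value and the toy dataframe here are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Toy frame standing in for `data`; the values are made up.
toy = pd.DataFrame({
    "class": ["good", "bad", "average", "good", "bad"],
    "clip_id": [0, 1, 2, 3, 4],
})

SEED = 42  # arbitrary choice, not taken from the notebook

# The same seed gives the same permutation on every call.
shuffled_a = toy.sample(frac=1, random_state=SEED).reset_index(drop=True)
shuffled_b = toy.sample(frac=1, random_state=SEED).reset_index(drop=True)

same_order = shuffled_a.equals(shuffled_b)
print(same_order)
```

The `reset_index(drop=True)` call is optional here, since the split below uses positional `iloc` slicing.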

### now we split the dataset into 90% training data and 10% testing data

In [0]:
#split into train and test(10%)
training_data = data.iloc[:int(.9 * len(data))]
testing_data = data.iloc[int(.9 * len(data)):]

In [0]:
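def getXY(df):
    #stack the per-clip spectrograms into one array of shape (n,128,216)
    X = np.stack(df.logmels)
    #add a trailing channel axis, since Conv2D expects (n,height,width,channels)
    X = X.reshape(len(df),128,216,1)
    #class labels as a 1-d array
    Y = np.array(df['class'])
    return X,Y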

In [0]:
trainX,trainY = getXY(training_data)
testX,testY = getXY(testing_data)
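To make the shapes produced by `getXY` concrete, here is a small sketch that runs the same stack-and-reshape steps on random arrays of the same size. The array contents and the count of five clips are made up.

```python
import numpy as np

N_MELS, N_FRAMES = 128, 216  # spectrogram size used by getXY above

# Five random arrays standing in for the 'logmels' column.
fake_logmels = [np.random.rand(N_MELS, N_FRAMES) for _ in range(5)]

X = np.stack(fake_logmels)                           # shape (5, 128, 216)
X_reshaped = X.reshape(len(X), N_MELS, N_FRAMES, 1)  # shape (5, 128, 216, 1)

# Equivalent way to add the channel axis without hard-coding the size.
X_newaxis = X[..., np.newaxis]

print(X.shape, X_reshaped.shape)
```

The trailing axis of length 1 is the single "channel" of each spectrogram, analogous to a grayscale image.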

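### here we encode the labels good/bad/average as the integers 0, 1, 2 (LabelEncoder assigns them in sorted label order, not in the order listed)

In [0]:
#encode labels
le = LabelEncoder()
trainY = le.fit_transform(trainY)
#reuse the mapping fitted on the training labels instead of refitting on the test set
testY = le.transform(testY)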
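`LabelEncoder` numbers the classes in sorted order, and the numbering depends on which labels it saw during fitting. The sketch below shows why the test labels should be encoded with the encoder fitted on the training labels. The label strings are assumptions, since the actual values in the 'class' column are not shown.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical label strings; the real 'class' values may differ.
train_labels = np.array(["good", "bad", "average", "good"])
test_labels = np.array(["good", "bad"])  # this toy test set has no "average"

le = LabelEncoder()
train_ids = le.fit_transform(train_labels)
test_ids = le.transform(test_labels)  # reuse the mapping learned on train

# Mapping from label string to integer code, in sorted label order.
mapping = {str(label): int(i) for i, label in enumerate(le.classes_)}

# Refitting on the test labels alone gives a different numbering.
refit_ids = LabelEncoder().fit_transform(test_labels)

print(mapping, list(test_ids), list(refit_ids))
```

In this toy case "good" is encoded as 2 by the fitted encoder but as 1 after refitting, which would silently mislabel the test set.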

### then we one-hot encode the integer labels into 3-element binary vectors

In [17]:
#one hot encode labels
ohe = OneHotEncoder(sparse = False)
trainY = ohe.fit_transform(trainY.reshape(len(trainY),1))
#reuse the encoder fitted on the training labels
testY = ohe.transform(testY.reshape(len(testY),1))

In case you used a LabelEncoder before this OneHotEncoder to convert the categories to integers, then you can now use the OneHotEncoder directly.
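For readers unfamiliar with one-hot encoding, the transformation can be sketched in plain NumPy. This is an illustration of the idea, not the code path `OneHotEncoder` uses internally, and the example label ids are made up.

```python
import numpy as np

NUM_CLASSES = 3
label_ids = np.array([2, 0, 1, 2])  # made-up integer labels

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(NUM_CLASSES)[label_ids]

# argmax undoes the encoding, which is how predictions are decoded later.
recovered = one_hot.argmax(axis=1)

print(one_hot)
```

Each row contains a single 1 in the column of its class, matching the 3-neuron softmax output of the network.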


### here we define the architecture of the CNN that we are going to train
### first a convolution layer with 32 filters, each of size 5*5, applied with a stride of 2
### then a batch normalization layer
### then an activation layer with the 'relu' activation function
### then a max pooling layer with a pool size of 2*2
### then a convolution layer with 64 filters of size 5*5, again with a stride of 2
### then batch normalization
### then relu activation
### then max pooling
### then a Flatten layer to reshape the activation maps into a 1-d array
### then a fully connected Dense layer with 128 neurons
### then a batch normalization layer
### then an activation layer with 'relu' activation
### then a final output Dense layer with 3 neurons and 'softmax' activation

In [18]:
#build the model
model = Sequential()
#model.add(Dropout(0.2,input_shape = (128,216,1)))
model.add(Conv2D(filters = 32,kernel_size = 5,strides = 2,input_shape = (128,216,1)))
model.add(BatchNormalization())
model.add(Activation('relu'))
#model.add(Dropout(0.5))
model.add(MaxPooling2D(pool_size = 2))
          
#input_shape is only needed on the first layer of a Sequential model
model.add(Conv2D(filters = 64,kernel_size = 5,strides = 2))
model.add(BatchNormalization())
model.add(Activation('relu'))
#model.add(Dropout(0.5))
model.add(MaxPooling2D(pool_size = 2))

model.add(Flatten())

model.add(Dense(units = 128))
model.add(BatchNormalization())
model.add(Activation('relu'))
#model.add(Dropout(0.5))
model.add(Dense(units = 3,activation = 'softmax'))

Instructions for updating:
Colocations handled automatically by placer.
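The feature-map sizes in this architecture can be worked out by hand. The sketch below assumes the Keras defaults that the code relies on: 'valid' (no) padding for `Conv2D` and `MaxPooling2D`, and a pooling stride equal to the pool size. The helper name `conv_out` is introduced here for illustration.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1

height, width = 128, 216  # input spectrogram size
shapes = []
for name, kernel, stride in [
    ("conv1", 5, 2),  # Conv2D(32, kernel_size=5, strides=2)
    ("pool1", 2, 2),  # MaxPooling2D(pool_size=2)
    ("conv2", 5, 2),  # Conv2D(64, kernel_size=5, strides=2)
    ("pool2", 2, 2),  # MaxPooling2D(pool_size=2)
]:
    height = conv_out(height, kernel, stride)
    width = conv_out(width, kernel, stride)
    shapes.append((name, height, width))

# The Flatten layer sees height * width * 64 values per clip.
flattened = height * width * 64

for row in shapes:
    print(row)
print("flattened:", flattened)
```

Under these assumptions the maps shrink from 128x216 to 7x12 before flattening. `model.summary()` can be used to confirm the sizes on the actual model.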


### here we compile the CNN defined above with the 'adam' optimizer
### the loss function is 'categorical_crossentropy'

In [0]:
model.compile(optimizer = 'adam',loss = 'categorical_crossentropy',metrics = ['accuracy'])
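Categorical cross-entropy is the mean over samples of the negative log of the probability that the model assigns to the true class. The NumPy sketch below computes it for made-up predictions. The clipping constant `eps` is an assumption added for numerical safety and need not match the value Keras uses.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two made-up samples with one-hot labels (class 2 and class 0).
y_true = np.array([[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0]])

# Predictions that put most of the probability on the true class.
confident = np.array([[0.05, 0.05, 0.90],
                      [0.80, 0.10, 0.10]])

# Predictions that spread probability evenly over the 3 classes.
uniform = np.full((2, 3), 1.0 / 3.0)

loss_confident = categorical_crossentropy(y_true, confident)
loss_uniform = categorical_crossentropy(y_true, uniform)

print(loss_confident, loss_uniform)
```

A model that guesses uniformly over 3 classes scores ln(3), about 1.10, which gives a baseline for reading the loss values reported during training.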

### finally we train our model with a batch size of 32,
### holding out 10% of the training data for validation
### after each epoch the weights are saved whenever the validation accuracy improves

In [20]:
filepath="try7-weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
model.fit(trainX,trainY,batch_size = 32,epochs = 500,verbose = 1,validation_split = 0.1, callbacks=callbacks_list)

Instructions for updating:
Use tf.cast instead.
Train on 2583 samples, validate on 288 samples
Epoch 1/500

Epoch 00001: val_acc improved from -inf to 0.61458, saving model to try7-weights-improvement-01-0.61.hdf5
Epoch 2/500

Epoch 00002: val_acc improved from 0.61458 to 0.73264, saving model to try7-weights-improvement-02-0.73.hdf5
Epoch 3/500

Epoch 00003: val_acc improved from 0.73264 to 0.74653, saving model to try7-weights-improvement-03-0.75.hdf5
Epoch 4/500

Epoch 00004: val_acc did not improve from 0.74653
Epoch 5/500

Epoch 00005: val_acc improved from 0.74653 to 0.83681, saving model to try7-weights-improvement-05-0.84.hdf5
Epoch 6/500

Epoch 00006: val_acc did not improve from 0.83681
Epoch 7/500

Epoch 00007: val_acc improved from 0.83681 to 0.89236, saving model to try7-weights-improvement-07-0.89.hdf5
Epoch 8/500

Epoch 00008: val_acc did not improve from 0.89236
Epoch 9/500

Epoch 00009: val_acc did not improve from 0.89236
Epoch 10/500

Epoch 00010: val_acc did not imp

KeyboardInterrupt: ignored
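Since each improvement writes a new file, the best checkpoint can be picked by parsing the epoch and validation accuracy back out of the filenames. This is a hedged sketch that assumes the filename pattern used above. The helper `best_checkpoint` is hypothetical, and the list of names is copied from the log lines shown.

```python
import re

# Matches the names produced by the `filepath` template above.
PATTERN = re.compile(r"try7-weights-improvement-(\d+)-(\d+\.\d+)\.hdf5$")

def best_checkpoint(filenames):
    """Return (filename, epoch, val_acc) for the highest val_acc, or None.

    Ties on the rounded val_acc are broken in favor of the later epoch.
    """
    parsed = []
    for name in filenames:
        match = PATTERN.search(name)
        if match:
            parsed.append((float(match.group(2)), int(match.group(1)), name))
    if not parsed:
        return None
    val_acc, epoch, name = max(parsed)
    return name, epoch, val_acc

# Checkpoints named in the training log above.
saved = [
    "try7-weights-improvement-01-0.61.hdf5",
    "try7-weights-improvement-02-0.73.hdf5",
    "try7-weights-improvement-03-0.75.hdf5",
    "try7-weights-improvement-05-0.84.hdf5",
    "try7-weights-improvement-07-0.89.hdf5",
]

print(best_checkpoint(saved))
```

In practice the list could come from `glob.glob("try7-weights-improvement-*.hdf5")`. Note that the accuracy in the filename is rounded to two decimals, so it is only an approximate ranking.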

In [0]:
model = load_model("try7-weights-improvement-87-0.96.hdf5")

### here we evaluate our model on the testing data and achieve an accuracy of 95.6%

In [22]:
model.evaluate(testX,testY,batch_size = 32,verbose = 1)



[0.19681823286330064, 0.9561128498618505]
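The list returned by `model.evaluate` holds the loss followed by the metrics passed to `compile`, so here it is [loss, accuracy]. A single accuracy figure hides which classes get confused, so a confusion matrix is a natural follow-up. The sketch below builds one in NumPy from made-up probabilities that stand in for `model.predict(testX)`. The function name is introduced here for illustration.

```python
import numpy as np

def confusion_matrix(y_true_onehot, y_prob, num_classes=3):
    """Rows are true classes, columns are predicted classes."""
    true_ids = y_true_onehot.argmax(axis=1)
    pred_ids = y_prob.argmax(axis=1)
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # Add 1 to cell (true, predicted) for every sample.
    np.add.at(matrix, (true_ids, pred_ids), 1)
    return matrix

# Made-up one-hot labels and predicted probabilities for 4 clips.
y_true = np.eye(3)[[0, 1, 2, 2]]
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.5, 0.3, 0.2]])

cm = confusion_matrix(y_true, y_prob)
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal

print(cm)
print(accuracy)
```

On the real data, `y_true` would be `testY` and `y_prob` the output of `model.predict(testX)`. The row and column order follows the integer codes assigned by the label encoder.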