# Model Training

This notebook continues the model_selection notebook, in which we attempted to train several prepackaged Scikit-Learn models on our feature data. None of those models was much more effective than random chance at predicting genres from our features, if it was better at all. We concluded that our system must be non-linear, so in this notebook we will use Keras to build a deep learning solution.

### Linear vs Non-Linear Optimization

All machine learning can be considered mathematical optimization. Even neural networks, though often described as "simulations" of the human brain, are really only homages to organic brains. In our last notebook we treated our data as a linear system, which would make machine learning (relatively) simple: the classes could be separated by a straight boundary in feature space, and the algorithm would only need to estimate that boundary. It would have been very nice if this were the case. In a nonlinear system the results may not be linearly separable, and they then cannot be solved with the same methods.
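
As a minimal illustration of linear separability (a toy sketch, not our music data), the XOR problem below is fitted by least squares twice: once with the raw inputs, and once with an added product feature. The helper `least_squares_predictions` and the toy arrays are assumptions introduced only for this example.

```python
import numpy as np

# XOR: the classic example of data that no straight line can separate.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)


def least_squares_predictions(features, targets):
    """Fit targets ~ features @ w + b by least squares and return the fitted values."""
    design = np.hstack([features, np.ones((features.shape[0], 1))])  # add bias column
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return design @ weights


# A purely linear model can do no better than the same prediction for every point.
linear_pred = least_squares_predictions(X, y)

# Adding one nonlinear feature (the product x1 * x2) makes an exact fit possible.
X_nonlinear = np.hstack([X, (X[:, 0] * X[:, 1]).reshape(-1, 1)])
nonlinear_pred = least_squares_predictions(X_nonlinear, y)

print(np.round(linear_pred, 3))
print(np.round(nonlinear_pred, 3))
```

The linear fit collapses to a constant, while the fit with the extra nonlinear feature reproduces the targets. A neural network learns such nonlinear features itself rather than having them added by hand.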

### Deep Learning

Deep neural networks (DNNs) have long been used to model nonlinear systems and relationships. They are particularly useful in image recognition and natural language processing, among other areas, and since we represent our music data as Mel spectrograms we may be able to use the same techniques here. A DNN is simply an artificial neural network (ANN) with many hidden layers between the input and output layers. Convolutional neural networks have been used to solve genre classification problems in both single- and multi-label contexts, so we will explore them here in combination with recurrent neural networks, which are able to analyze time-sequence data.
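
The difference between single- and multi-label classification shows up mainly in the output layer. The sketch below uses hypothetical scores for three genres to compare a softmax output (one genre per song) with independent sigmoid outputs (any number of genres per song). The function names and the numbers are illustrative assumptions.

```python
import numpy as np


def softmax(logits):
    """Turn scores into probabilities that compete with each other and sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def sigmoid(logits):
    """Turn each score into its own independent probability."""
    return 1.0 / (1.0 + np.exp(-logits))


logits = np.array([2.0, 1.5, -1.0])  # hypothetical scores for three genres

single_label = softmax(logits)  # suited to "exactly one genre per song"
multi_label = sigmoid(logits)   # suited to "any number of genres per song"

print(single_label, single_label.sum())
print(multi_label, multi_label > 0.5)
```

With sigmoid outputs, the first two genres can both be predicted for the same song, which a softmax output would not allow.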

### Keras

Keras is a Python API for building deep learning models, with strong support for both convolutional and recurrent neural networks. It was originally built on top of the popular Theano library and effectively acts as a wrapper for it; newer releases can also use Google's TensorFlow library as the backend instead. We will use it to build our DNN below. Keras can run on either CPU or GPU; we will use the CPU option due to technical and budget limitations.

### Creation of DLGINN, the Deep Learning Genre Identification Neural Network

In [13]:
import numpy as np
import mysql.connector as dbc
from sklearn.model_selection import train_test_split
from keras.models import Sequential
# Import every layer from keras.layers; the submodule paths differ between Keras versions
from keras.layers import (BatchNormalization, Conv2D, Dense, Dropout, ELU,
                          Flatten, MaxPooling2D, ZeroPadding2D)

We need to re-import all of our data, so we will reuse the code from model_selection (see model_selection.ipynb for more information on the cells below).

In [2]:
datapath = '/home/seancrwhite/HDD/Data/fma/data/db/melgrams.csv'

data = np.loadtxt(datapath, delimiter=',')

print(data.shape)

(262400, 1291)


In [4]:
ids = np.reshape(data[:, 0], (1, 262400))
data = np.reshape(data[:, 1:], (2050, 128, 1290, 1))

In [5]:
# Each song spans 128 consecutive rows (one per Mel band), so keep every 128th id
u_ids = [int(s_id) for s_id in ids[0][::128]]
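
The reshaping and id extraction above can be checked on a toy table. This is a sketch under the assumption that the CSV stores one row per Mel band, with the song id in column 0. The sizes and ids are made up, and the names `table`, `unique_ids`, and `melgrams` exist only in this example.

```python
import numpy as np

n_songs, n_bands, n_frames = 3, 4, 5  # toy sizes; the notebook uses 2050, 128, 1290

# Toy table in the assumed CSV layout: song id first, then the frame values.
song_ids = np.repeat([101, 102, 103], n_bands)  # hypothetical ids
values = np.arange(n_songs * n_bands * n_frames, dtype=float).reshape(-1, n_frames)
table = np.column_stack([song_ids, values])

# Every n_bands-th row starts a new song, so a strided slice yields one id per song.
unique_ids = table[::n_bands, 0].astype(int)

# Consecutive groups of n_bands rows become one (bands, frames, 1) "image" per song.
melgrams = table[:, 1:].reshape(n_songs, n_bands, n_frames, 1)

print(unique_ids)
print(melgrams.shape)
```

The trailing axis of length 1 is the channel dimension that `Conv2D` expects, in the same way a grayscale image has a single channel.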

In [6]:
db = dbc.connect(port=3306,
                 user="root",
                 passwd="password",
                 db="SONG")
cursor = db.cursor()

labels = []
idxs = []

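for idx, s_id in enumerate(u_ids):
    # Parameterized query: the driver substitutes the id instead of string formatting
    cursor.execute("select * from SONG.GENRES where s_id=%s", (s_id,))

    row = cursor.fetchone()

    if row is None:
        # No genre row for this song: remember its position so it can be dropped
        idxs.append(idx)
    else:
        labels.append(row)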

labels = np.array(labels)
labels = labels[:,1:]
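
The lookup pattern in this cell can be tried without a MySQL server. The sketch below substitutes Python's built-in `sqlite3` module for `mysql.connector` and uses a hypothetical two-genre table. Note that `sqlite3` uses `?` as its placeholder where MySQL Connector uses `%s`.

```python
import sqlite3

# In-memory stand-in for the SONG.GENRES table (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE genres (s_id INTEGER PRIMARY KEY, rock INTEGER, jazz INTEGER)")
conn.executemany("INSERT INTO genres VALUES (?, ?, ?)", [(1, 1, 0), (3, 0, 1)])

song_ids = [1, 2, 3]  # song 2 has no genre row
labels, missing = [], []

for position, s_id in enumerate(song_ids):
    # The driver substitutes the value; nothing is spliced into the SQL string.
    row = conn.execute("SELECT * FROM genres WHERE s_id = ?", (s_id,)).fetchone()
    if row is None:
        missing.append(position)  # remember which sample to drop later
    else:
        labels.append(row[1:])    # strip the id column, keep the genre flags

conn.close()
print(labels, missing)
```

Recording the position of each missing song, rather than its id, is what allows the matching rows to be removed from the data array afterwards.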

In [7]:
# Drop every sample that has no labels in a single call
data = np.delete(data, idxs, axis=0)
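
An equivalent way to drop the unlabeled samples is a boolean keep-mask, sketched below on toy data. The array contents and the `missing` positions are made up for illustration.

```python
import numpy as np

data = np.arange(5 * 2 * 3).reshape(5, 2, 3)  # five toy samples
missing = [1, 3]                              # hypothetical positions with no label

# Build a keep-mask once and index with it, instead of deleting row by row.
keep = np.ones(len(data), dtype=bool)
keep[missing] = False
filtered = data[keep]

# np.delete with a list of indices gives the same result in a single call.
same_as_delete = np.array_equal(filtered, np.delete(data, missing, axis=0))

print(filtered.shape, same_as_delete)
```

Either approach copies the array once, whereas deleting inside a loop copies it once per missing sample. That difference matters for an array as large as ours.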

In [8]:
print(labels.shape)
print(data.shape)

(1973, 16)
(1973, 128, 1290, 1)


In [9]:
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.33, random_state=73)
print(X_train.shape)

(1321, 128, 1290, 1)
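
The size of the training set can be checked by hand. The sketch assumes that a fractional `test_size` is rounded up to a whole number of samples and that the training set receives the remainder.

```python
import math

n_samples = 1973
test_size = 0.33

# Assumption: the test fraction is rounded up, and training gets whatever is left.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```

Under that assumption the arithmetic gives 1321 training samples, which agrees with the shape printed above.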


Now we can begin building our model, piece by piece, using the Keras Sequential object. The CNN architecture follows the one outlined by Keunwoo Choi, George Fazekas, and Mark Sandler: https://arxiv.org/abs/1606.00298

In [16]:
model = Sequential()

model.add(BatchNormalization(axis=1, input_shape=(128, 1290, 1), name='input'))

model.add(Conv2D(32, (3, 3), name='conv1'))
model.add(BatchNormalization(axis=3))
model.add(ELU(alpha=1.0))
model.add(MaxPooling2D(pool_size=(2, 4), name='pool1'))

model.add(Conv2D(32, (3, 3), name='conv2'))
model.add(BatchNormalization(axis=3))
model.add(ELU(alpha=1.0))
model.add(MaxPooling2D(pool_size=(3, 4), name='pool2'))

model.add(Conv2D(32, (3, 3), name='conv3'))
model.add(BatchNormalization(axis=3))
model.add(ELU(alpha=1.0))
model.add(MaxPooling2D(pool_size=(2, 5), name='pool3'))

model.add(Conv2D(32, (3, 3), name='conv4'))
model.add(BatchNormalization(axis=3))
model.add(ELU(alpha=1.0))
model.add(MaxPooling2D(pool_size=(2, 4), name='pool4'))

model.add(Conv2D(32, (3, 3), name='conv5'))
model.add(BatchNormalization(axis=3))
model.add(ELU(alpha=1.0))
model.add(MaxPooling2D(pool_size=(1, 1), name='pool5'))

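model.add(Flatten())
# One sigmoid unit per genre column in the label matrix
model.add(Dense(16, activation='sigmoid', name='output'))

# Note: categorical_crossentropy is normally paired with a softmax output;
# independent sigmoid outputs are normally trained with binary_crossentropy.
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])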

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input (BatchNormalization)   (None, 128, 1290, 1)      512       
_________________________________________________________________
conv1 (Conv2D)               (None, 126, 1288, 32)     320       
_________________________________________________________________
batch_normalization_21 (Batc (None, 126, 1288, 32)     128       
_________________________________________________________________
elu_7 (ELU)                  (None, 126, 1288, 32)     0         
_________________________________________________________________
pool1 (MaxPooling2D)         (None, 63, 322, 32)       0         
_________________________________________________________________
conv2 (Conv2D)               (None, 61, 320, 32)       9248      
_________________________________________________________________
batch_normalization_22 (Batc (None, 61, 320, 32)       128       
__________
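
The output shapes and convolution parameter counts in the summary can be reproduced with simple arithmetic. The sketch assumes unpadded, stride-1 convolutions and non-overlapping pooling, which are the Keras defaults for the layers as written above. The helpers `conv_valid` and `pool` are names introduced only for this example.

```python
def conv_valid(size, kernel=3):
    """Output length of an unpadded, stride-1 convolution along one axis."""
    return size - kernel + 1


def pool(size, window):
    """Output length of non-overlapping max pooling (any remainder is dropped)."""
    return size // window


height, width, channels = 128, 1290, 1
pool_sizes = [(2, 4), (3, 4), (2, 5), (2, 4), (1, 1)]
filters = 32
shapes, conv_params = [], []

for pool_h, pool_w in pool_sizes:
    # 3x3 kernel weights for every input/output channel pair, plus one bias per filter.
    conv_params.append(3 * 3 * channels * filters + filters)
    height, width, channels = conv_valid(height), conv_valid(width), filters
    shapes.append(("conv", height, width, channels))
    height, width = pool(height, pool_h), pool(width, pool_w)
    shapes.append(("pool", height, width, channels))

for shape in shapes:
    print(shape)
print(conv_params)
```

The first rows match the summary: conv1 produces 126 x 1288 x 32 with 320 parameters, pool1 produces 63 x 322 x 32, and conv2 produces 61 x 320 x 32 with 9248 parameters. By the last pooling layer the feature map has shrunk to 1 x 1 x 32, so `Flatten` passes only 32 values to the output layer.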

### Evaluation

With a model in hand we can now train and evaluate it on our data, using the training and test sets produced by Scikit-Learn's train_test_split above.

In [None]:
model.fit(X_train, y_train, epochs=3)

#score_train = model.evaluate(X_train, y_train)
score_test = model.evaluate(X_test, y_test)

#print("Training Data Accuracy: {}".format(score_train[1]))
print("Test Data Accuracy: {}".format(score_test[1]))

Epoch 1/3
