### Changes from run-1

#### Test data
* Previously we used cross-validation to select the best model and then tested that model on an opus (opus 131) that we had held out from the start.
* The scores achieved when predicting on opus 131 were our final results.
* This time we train and validate on all of the data and report the final cross-validation scores as our results.

#### Cross-validation
* Previously we took all sequences in the train/validate set, shuffled them, and trained/validated with an 80/20 split.
* This time we split the data by opus before generating the sequences and use leave-one-opus-out cross-validation: each fold validates on one opus and trains on the rest. The idea is that this gives a more accurate estimate of how well patterns generalize across opuses.
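
The leave-one-opus-out split can be sketched as follows. This is a minimal illustration, assuming one opus label per chord row; the helper name `leave_one_opus_out` and the toy labels are hypothetical and not part of the notebook.

```python
import numpy as np

def leave_one_opus_out(ops):
    """Yield (held_out_opus, train_idx, valid_idx) once per distinct opus."""
    ops = np.asarray(ops)
    for opus in np.unique(ops):
        valid_idx = np.flatnonzero(ops == opus)  # rows of the held-out opus
        train_idx = np.flatnonzero(ops != opus)  # rows of every other opus
        yield opus, train_idx, valid_idx

# Toy opus labels, one per chord row (hypothetical values)
ops = [127, 127, 127, 95, 95, 131, 131, 131, 131]
folds = list(leave_one_opus_out(ops))

for opus, train_idx, valid_idx in folds:
    print(f"validate on opus {opus}: "
          f"{len(train_idx)} training rows, {len(valid_idx)} validation rows")
```

Because the split happens before any sequences are built, no sequence in the validation fold shares chords with the training fold.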

#### Input
* Previously we grouped similar chords together and grouped chords that appeared rarely (fewer than 10 times) under a single label.
* The idea was to remove outliers, reduce the output space, and improve generalization.
* Since we were advised that making the number of output classes depend on the input data was a bad idea, we now group chords using rules that are independent of the data.
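
To illustrate why the earlier frequency-based grouping ties the number of output classes to the data, here is a small sketch. The function `group_rare`, the label `OTHER`, and the toy chord lists are assumptions made for illustration; only the threshold of 10 occurrences comes from the description above.

```python
from collections import Counter

def group_rare(chords, min_count=10, rare_label="OTHER"):
    """Replace every chord seen fewer than min_count times with one shared label."""
    counts = Counter(chords)
    return [c if counts[c] >= min_count else rare_label for c in chords]

# Two toy datasets that differ only in how often "V" occurs
small = ["I"] * 12 + ["V"] * 9 + ["ii"] * 3
large = small + ["V"] * 5

classes_small = sorted(set(group_rare(small)))
classes_large = sorted(set(group_rare(large)))

print(classes_small)  # "V" falls below the threshold and is merged
print(classes_large)  # "V" now passes the threshold and becomes its own class
```

The same chord vocabulary yields a different set of classes depending on the sample, which is the dependence that data-independent rules avoid.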

#### Model
* Previously the model architecture included a bidirectional LSTM layer because it increased performance. To make the results comparable to a simple N-gram model, we removed that layer in this iteration.
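
For reference, a simple N-gram baseline of the kind mentioned above could look like the sketch below. It is not the baseline used in this project; the function names and the toy chord sequence are hypothetical.

```python
from collections import Counter, defaultdict

def fit_ngram(sequence, n=2):
    """Count which chord follows each context of n-1 preceding chords."""
    table = defaultdict(Counter)
    for i in range(len(sequence) - n + 1):
        context = tuple(sequence[i:i + n - 1])
        table[context][sequence[i + n - 1]] += 1
    return table

def predict_next(table, context):
    """Return the most frequent continuation, or None for an unseen context."""
    counts = table.get(tuple(context))
    return counts.most_common(1)[0][0] if counts else None

# Toy chord sequence (hypothetical)
chords = ["I", "IV", "V", "I", "IV", "V", "I", "vi", "IV", "V"]
bigram = fit_ngram(chords, n=2)

print(predict_next(bigram, ["I"]))   # most common chord after "I"
print(predict_next(bigram, ["ii"]))  # context never seen in training
```

Such a model predicts only from the chords that precede the target.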

#### Hyperparameters
* Given the increased number of outliers and the removal of the bidirectional layer, we expect generalization accuracy to decrease.
* To counter that, we took the current model and iterated over different values of the L2 regularization strength, which we had not explored previously.
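
As a rough illustration of what the regularization strength controls: an L2 kernel regularizer adds `strength * sum(w ** 2)` over the regularized weights to the training loss. The sketch below computes that penalty with NumPy; the weight shapes and the data loss of 3.5 are made-up values, not taken from the model.

```python
import numpy as np

def l2_penalty(weights, strength):
    """Return strength times the sum of squared entries over all weight arrays."""
    return strength * sum(float(np.sum(np.square(w))) for w in weights)

rng = np.random.default_rng(0)
# Two toy weight matrices standing in for layer kernels (hypothetical shapes)
weights = [rng.normal(scale=0.1, size=(8, 4)),
           rng.normal(scale=0.1, size=(4, 3))]

data_loss = 3.5  # hypothetical cross-entropy value
strengths = [0, 0.001, 0.01, 0.1]
totals = [data_loss + l2_penalty(weights, s) for s in strengths]

for s, total in zip(strengths, totals):
    print(f"strength={s}: total training loss={total:.4f}")
```

The penalty is added to the training loss only, so a large strength can make the reported training loss much higher than the validation loss early in training.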


In [1]:
#Imports
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
import seaborn as sns
sns.set()

from tensorflow.keras import *
from tensorflow.keras.models import *
from tensorflow.keras.layers import *
from tensorflow.keras.callbacks import *

from chord_functions import *

from sklearn.metrics import *
from sklearn.model_selection import KFold

from collections import defaultdict


# Setup

In [2]:
# fix random seed for reproducibility
seed = 1
np.random.seed(seed)

#Load all data
data = pd.read_csv('data/R10.csv')

#Remove redundant attributes. Keep op to split into opuses
data = data[['chord', 'op']]

#Use dummy variable representation for the chords
data = pd.get_dummies(data)
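
A small sketch of what the dummy-variable step does, assuming `chord` holds string labels and `op` holds integers. The toy rows below are hypothetical and are not taken from `data/R10.csv`.

```python
import pandas as pd

# Toy stand-in for the two columns kept from the dataset (hypothetical values)
toy = pd.DataFrame({
    "chord": ["I", "IV", "V", "I"],
    "op": [127, 127, 95, 95],
})

# String columns are expanded into one indicator column per distinct value;
# numeric columns such as `op` are passed through unchanged.
encoded = pd.get_dummies(toy)

print(encoded.columns.tolist())
print(encoded)
```

Keeping `op` numeric is what lets the later cells filter rows by opus and then drop the column before building sequences.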

# Model

In [3]:
def lstm(lstm_x, lstm_y, optimizer, loss, metrics, regstrength):
    """Build and compile a two-layer LSTM classifier with L2-regularized kernels."""
    model = Sequential()

    #First LSTM layer returns the full sequence so that a second LSTM can be stacked on it
    model.add(LSTM(256, return_sequences=True,
                   input_shape=(lstm_x.shape[1], lstm_x.shape[2]),
                   kernel_regularizer=regularizers.l2(regstrength)))
    model.add(Dropout(0.5))

    #Second LSTM layer returns only its final output
    model.add(LSTM(64, return_sequences=False,
                   kernel_regularizer=regularizers.l2(regstrength)))
    model.add(Dropout(0.3))

    #Softmax over the chord classes
    model.add(Dense(lstm_y.shape[1], activation='softmax',
                    kernel_regularizer=regularizers.l2(regstrength)))

    model.compile(loss=loss,
                  optimizer=optimizer,
                  metrics=metrics)
    return model

# Train/Test

### Select parameters for the learning process

In [55]:
optimizer = 'Adam'
loss = 'categorical_crossentropy'
metrics = ['accuracy']
epochs = 30
verbose = 2
seq_length = 10

#Save the weights after every epoch (save_best_only=False keeps every checkpoint, not only improvements)
checkpoint = ModelCheckpoint(
    'weights.{epoch:02d}-{val_acc:.4f}.hdf5',
    monitor='val_acc',
    verbose=0,
    save_best_only=False
)
#Stop the learning process if validation accuracy has not improved for 5 epochs
earlystop = EarlyStopping(monitor='val_acc', min_delta=0, patience=5, verbose=1)

#callbacks_list = [checkpoint, earlystop]   
callbacks_list = [earlystop]  
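
The patience rule behind the early-stopping callback can be sketched in plain Python. This is a simplified illustration for a metric that should increase, not the Keras implementation; the function name and the accuracy values are hypothetical.

```python
def stopping_epoch(val_scores, patience=5, min_delta=0.0):
    """Return the index of the epoch after which training would stop.

    An epoch counts as an improvement only if its score exceeds the best
    score so far by more than min_delta. Training stops once `patience`
    consecutive epochs have passed without an improvement.
    """
    best = float("-inf")
    wait = 0
    for epoch, score in enumerate(val_scores):
        if score - min_delta > best:
            best = score
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_scores) - 1  # never triggered: all epochs run

# Hypothetical validation accuracies, one per epoch
val_acc = [0.10, 0.12, 0.13, 0.13, 0.12, 0.13, 0.11, 0.12, 0.14]

stop = stopping_epoch(val_acc, patience=5)
print(f"training would stop after epoch index {stop}")
```

In this toy run the best score at stopping time was reached five epochs earlier, so a rule like this is usually paired with keeping the weights from the best epoch.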

### Cross validate

In [58]:
#define the range of regularization strength to check
regstrength = [0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5]

print("Start!")

#Create container for results
RESULTS = pd.DataFrame()

for strength in regstrength:
    print("\nChecking regstrength {}".format(strength))
    cv_results = pd.DataFrame()
    
    for opus in data['op'].unique():
        print("\nValidating on opus {}".format(opus))

        #Split into training and validation
        valid = data[data['op'] == opus]
        train = data[data['op'] != opus]

        #Drop the opus attribute since it's no longer needed
        valid = valid.drop(columns='op')
        train = train.drop(columns='op')

        #Generate sequences from the data
        valid_in, valid_out = generate_sequences(valid, valid, seq_length)
        train_in, train_out = generate_sequences(train, train, seq_length)

        #Create model
        model = lstm(train_in, train_out, optimizer, loss, metrics, strength)

        #Train on the folds
        model.fit(train_in,
                  train_out,
                  epochs = epochs,
                  verbose = verbose,
                  validation_data = (valid_in, valid_out),
                  callbacks = callbacks_list)

        #Save the history object for the model, adding the validation opus and regstrength
        history = pd.DataFrame(model.history.history)
        history.index.name = 'epoch'
        history['opus'] = opus
        history['reg'] = strength
        #DataFrame.append was removed in pandas 2.0, so concatenate instead
        cv_results = pd.concat([cv_results, history])
    
    RESULTS = pd.concat([RESULTS, cv_results])

print("Done!")

Start!

Checking regstrength 0

Validating on opus 127
Train on 25863 samples, validate on 2210 samples
Epoch 1/2
 - 96s - loss: 3.6912 - acc: 0.1089 - val_loss: 3.6521 - val_acc: 0.1276
Epoch 2/2
 - 113s - loss: 3.4746 - acc: 0.1446 - val_loss: 3.5345 - val_acc: 0.1339

Validating on opus 95
Train on 26795 samples, validate on 1278 samples
Epoch 1/2
 - 136s - loss: 3.6951 - acc: 0.1107 - val_loss: 3.6932 - val_acc: 0.0829
Epoch 2/2
 - 98s - loss: 3.5141 - acc: 0.1373 - val_loss: 3.4058 - val_acc: 0.1862


| epoch | val_loss | val_acc | loss | acc | opus | reg |
|---|---|---|---|---|---|---|
| 0 | 3.65214 | 0.127602 | 3.691205 | 0.108881 | 127 | 0 |
| 1 | 3.534467 | 0.133937 | 3.474595 | 0.144608 | 127 | 0 |
| 0 | 3.693155 | 0.082942 | 3.695132 | 0.11073 | 95 | 0 |
| 1 | 3.405773 | 0.186228 | 3.514107 | 0.137264 | 95 | 0 |



Checking regstrength 0.1

Validating on opus 127
Train on 25863 samples, validate on 2210 samples
Epoch 1/2
 - 89s - loss: 5.8716 - acc: 0.1122 - val_loss: 3.8434 - val_acc: 0.0733
Epoch 2/2
 - 81s - loss: 3.7582 - acc: 0.1151 - val_loss: 3.8054 - val_acc: 0.1204

Validating on opus 95
Train on 26795 samples, validate on 1278 samples
Epoch 1/2
 - 92s - loss: 5.8109 - acc: 0.1132 - val_loss: 3.9013 - val_acc: 0.0829
Epoch 2/2
 - 84s - loss: 3.7561 - acc: 0.1160 - val_loss: 3.8192 - val_acc: 0.0829


| epoch | val_loss | val_acc | loss | acc | opus | reg |
|---|---|---|---|---|---|---|
| 0 | 3.65214 | 0.127602 | 3.691205 | 0.108881 | 127 | 0.0 |
| 1 | 3.534467 | 0.133937 | 3.474595 | 0.144608 | 127 | 0.0 |
| 0 | 3.693155 | 0.082942 | 3.695132 | 0.11073 | 95 | 0.0 |
| 1 | 3.405773 | 0.186228 | 3.514107 | 0.137264 | 95 | 0.0 |
| 0 | 3.843412 | 0.073303 | 5.871553 | 0.112168 | 127 | 0.1 |
| 1 | 3.805442 | 0.120362 | 3.758235 | 0.115145 | 127 | 0.1 |
| 0 | 3.901282 | 0.082942 | 5.810905 | 0.113155 | 95 | 0.1 |
| 1 | 3.819213 | 0.082942 | 3.756123 | 0.115954 | 95 | 0.1 |



Checking regstrength 0.5

Validating on opus 127
Train on 25863 samples, validate on 2210 samples
Epoch 1/2
 - 90s - loss: 14.1401 - acc: 0.1137 - val_loss: 3.9550 - val_acc: 0.1204
Epoch 2/2
 - 81s - loss: 3.8531 - acc: 0.1140 - val_loss: 3.8756 - val_acc: 0.0733

Validating on opus 95
Train on 26795 samples, validate on 1278 samples
Epoch 1/2
 - 92s - loss: 13.7865 - acc: 0.1158 - val_loss: 4.0011 - val_acc: 0.0829
Epoch 2/2
 - 81s - loss: 3.8488 - acc: 0.1174 - val_loss: 3.9003 - val_acc: 0.0829


| epoch | val_loss | val_acc | loss | acc | opus | reg |
|---|---|---|---|---|---|---|
| 0 | 3.65214 | 0.127602 | 3.691205 | 0.108881 | 127 | 0.0 |
| 1 | 3.534467 | 0.133937 | 3.474595 | 0.144608 | 127 | 0.0 |
| 0 | 3.693155 | 0.082942 | 3.695132 | 0.11073 | 95 | 0.0 |
| 1 | 3.405773 | 0.186228 | 3.514107 | 0.137264 | 95 | 0.0 |
| 0 | 3.843412 | 0.073303 | 5.871553 | 0.112168 | 127 | 0.1 |
| 1 | 3.805442 | 0.120362 | 3.758235 | 0.115145 | 127 | 0.1 |
| 0 | 3.901282 | 0.082942 | 5.810905 | 0.113155 | 95 | 0.1 |
| 1 | 3.819213 | 0.082942 | 3.756123 | 0.115954 | 95 | 0.1 |
| 0 | 3.95496 | 0.120362 | 14.140058 | 0.113676 | 127 | 0.5 |
| 1 | 3.875585 | 0.073303 | 3.853149 | 0.113985 | 127 | 0.5 |


Done!
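
The `generate_sequences` helper called in the loop above comes from `chord_functions`, which is not shown here. One plausible reading of a call like `generate_sequences(train, train, seq_length)` is a sliding window in which each run of `seq_length` one-hot chords is paired with the chord that follows it. The sketch below implements that reading under a hypothetical name, `make_windows`; the real helper may differ.

```python
import numpy as np

def make_windows(inputs, targets, seq_length):
    """Pair each window of seq_length input rows with the target row that follows it.

    Assumes len(inputs) > seq_length. Returns x with shape
    (n_windows, seq_length, n_features) and y with shape (n_windows, n_classes).
    """
    inputs = np.asarray(inputs)
    targets = np.asarray(targets)
    n_windows = len(inputs) - seq_length
    x = np.stack([inputs[i:i + seq_length] for i in range(n_windows)])
    y = targets[seq_length:]
    return x, y

# Seven one-hot encoded chords over four classes (hypothetical toy data)
one_hot = np.eye(4)[[0, 1, 2, 0, 1, 3, 0]]
x, y = make_windows(one_hot, one_hot, seq_length=3)

print(x.shape, y.shape)
```

With this layout, `x.shape[1]` and `x.shape[2]` provide the `input_shape` of the first LSTM layer and `y.shape[1]` gives the size of the softmax output.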


# Results

In [209]:
#Keep an independent copy of the raw results
BACKUP = RESULTS.copy()

#Index by (reg, opus, epoch) so the results can be grouped by regularization and by fold
INDEXED = BACKUP.rename_axis('epoch').reset_index().set_index(['reg', 'opus', 'epoch'])

averages = []

#For each level of regularization
for regularization, cvscores in INDEXED.groupby(level='reg'):
    best_epochs = []
    
    #Iterate through all folds and extract the highest validation score for each fold
    for opus, fold in cvscores.groupby(level='opus'):
        best_epochs.append(fold[fold['val_acc'] == fold['val_acc'].max()])
    
    #Make a pretty dataframe of the mean
    average = pd.concat(best_epochs).describe().loc[['mean']]
    average['reg'] = regularization
    average = average.set_index(average['reg']).drop(columns='reg')
    
    #Store the mean scores for this regularization value for comparison
    averages.append(average)

AVERAGES = pd.concat(averages)
BEST = AVERAGES[AVERAGES['val_acc'] == AVERAGES['val_acc'].max()]

print("The highest cross-validated scores and the corresponding regularization values")
display(BEST)

The highest cross-validated scores and the corresponding regularization values


| reg | val_loss | val_acc | loss | acc |
|---|---|---|---|---|
| 0.0 | 3.47012 | 0.160083 | 3.494351 | 0.140936 |
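
The selection rule used here (take the best validation accuracy per fold, then average over folds for each regularization value) can also be written compactly with `groupby`. The sketch below reuses the `val_acc` values printed in the training log for regularization strengths 0 and 0.1; the variable names are illustrative.

```python
import pandas as pd

# (epoch, val_acc, opus, reg) rows copied from the training output above
results = pd.DataFrame(
    [
        (0, 0.127602, 127, 0.0),
        (1, 0.133937, 127, 0.0),
        (0, 0.082942, 95, 0.0),
        (1, 0.186228, 95, 0.0),
        (0, 0.073303, 127, 0.1),
        (1, 0.120362, 127, 0.1),
        (0, 0.082942, 95, 0.1),
        (1, 0.082942, 95, 0.1),
    ],
    columns=["epoch", "val_acc", "opus", "reg"],
)

# Best validation accuracy reached in each fold, per regularization value
best_per_fold = results.groupby(["reg", "opus"])["val_acc"].max()

# Average the per-fold bests to get one score per regularization value
mean_per_reg = best_per_fold.groupby(level="reg").mean()
best_reg = mean_per_reg.idxmax()

print(mean_per_reg)
print("best regularization strength:", best_reg)
```

Unlike the row-filtering approach, this counts each fold exactly once even when several epochs tie for the best score.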
