### Crude neural network implementation for song year predictions

The following walks through the code we developed for a basic neural network model trained with stochastic gradient descent (SGD). The network is trained on a simplified version of the Million Song Dataset, available at: http://archive.ics.uci.edu/ml/datasets/YearPredictionMSD

This dataset has 515,345 examples, each with 90 features: 12 timbre averages and 78 timbre covariances. This is one of several initial exploratory experiments that we will run to gain a better sense of the data at hand.

Import relevant libraries

In [40]:
from sklearn.neural_network import MLPClassifier

from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_recall_fscore_support
import numpy as np

The 'Rank' function produces some warnings; the lines below suppress them.

In [41]:
import warnings
warnings.filterwarnings("ignore")
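A blanket `filterwarnings("ignore")` silences every warning for the rest of the session, including ones that may matter later. As a hedged alternative, the sketch below limits suppression to a single block with the standard library's `warnings.catch_warnings()`. The function `noisy_step` is a hypothetical stand-in for whichever call emits the warning.

```python
import warnings

def noisy_step():
    # Hypothetical stand-in for any call that emits a warning
    warnings.warn("example warning", UserWarning)
    return "done"

# Warnings are ignored only inside this block;
# the previous filter settings are restored when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy_step()
```

Outside the `with` block, warnings behave as they did before, so unrelated problems stay visible.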

Declare empty lists for our examples and labels

In [42]:
labels = []
examples = []

Change this path to the location of the csv file on your computer.

In [43]:
filename = "/mnt/c/Users/Aumit/Desktop/YearPredictionMSD.txt/yp.csv"

This loop goes through every line in the csv, adds the first column (label) to the labels list, and then adds the rest of the columns to the examples list.

In [44]:
with open(filename, 'r') as f:
    for line in f:
        content = line.split(",")

        # The first column is the release year (the label)
        labels.append(int(content[0]))

        # The remaining columns are the timbre features
        content = [float(elem) for elem in content[1:]]

        # Optional: convert each row to a numpy array (not necessary here)
        #npa = np.asarray(content, dtype=np.float64)

        examples.append(content)
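The same parsing can be wrapped in a small function that returns numpy arrays, which makes column slicing easier later. This is only a sketch: the name `parse_year_rows` and the three-feature sample rows are made up for illustration, and the real file has 90 features per row.

```python
import io
import numpy as np

def parse_year_rows(lines):
    """Parse comma-separated rows of the form: year, feature_1, ..., feature_n."""
    years, features = [], []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        fields = line.split(",")
        years.append(int(fields[0]))
        features.append([float(x) for x in fields[1:]])
    return np.asarray(years, dtype=np.int64), np.asarray(features, dtype=np.float64)

# Tiny made-up sample with 3 features per row (the real file has 90)
sample = io.StringIO("2001,1.0,2.0,3.0\n1998,4.0,5.0,6.0\n")
sample_years, sample_features = parse_year_rows(sample)
```

With the real file, an open file handle can be passed in place of `sample`. Assuming the columns follow the order given in the dataset description (averages first), `features[:, :12]` would hold the timbre averages and `features[:, 12:]` the timbre covariances.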

Training and test split. We have over 500,000 examples to work with, so we can later train on more data and see whether the model improves. For now, this crude model trains on only the first 10,000 examples (for speed) and is tested on the last 1,000.

It may also be worthwhile to create a cross-validation (CV) set so we can tune hyperparameters such as alpha and the number of layers and nodes per layer.

In [45]:
training_examples = examples[:10000]
training_labels = labels[:10000]

test_examples = examples[-1000:]
test_labels = labels[-1000:]
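One way the CV set mentioned above could be carved out is to take it from the rows that follow the training slice, keeping the test rows at the end of the file. The UCI page for this dataset recommends using the first 463,715 rows for training and the last 51,630 for testing so that no artist appears in both, and a head/tail split like the one here stays within that recommendation. The function name `split_head_tail` and the toy data are assumptions for illustration.

```python
def split_head_tail(examples, labels, n_train, n_val, n_test):
    """Take train and validation rows from the front and test rows from the end."""
    if min(n_train, n_val, n_test) <= 0:
        raise ValueError("all split sizes must be positive")
    if n_train + n_val + n_test > len(examples):
        raise ValueError("requested splits would overlap")
    train = (examples[:n_train], labels[:n_train])
    val = (examples[n_train:n_train + n_val], labels[n_train:n_train + n_val])
    test = (examples[-n_test:], labels[-n_test:])
    return train, val, test

# Toy data standing in for the real examples and labels
toy_examples = [[float(i)] for i in range(20)]
toy_labels = list(range(20))
(train_x, train_y), (val_x, val_y), (test_x, test_y) = split_head_tail(
    toy_examples, toy_labels, n_train=10, n_val=4, n_test=3)
```

On the real data, a call such as `split_head_tail(examples, labels, 10000, 1000, 1000)` would keep the training and test slices used in this notebook and add a 1,000-row validation slice.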

Defining our classifier. This particular classifier uses stochastic gradient descent with an alpha of 1e-5 (in scikit-learn's MLPClassifier, alpha is the L2 regularization strength, not the learning rate). These parameters can be tuned. Currently, this neural network has 4 hidden layers, each with 100 hidden nodes.

In [46]:
clf = MLPClassifier(solver='sgd', alpha=1e-5,
                     hidden_layer_sizes=(100, 100, 100, 100), random_state=1)
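As a rough sketch of how alpha could be tuned, the function below fits one model per candidate value and keeps the one with the best validation accuracy. The name `pick_alpha`, the candidate values, and the synthetic data are all assumptions for illustration. The `StandardScaler` step is also an addition that is not part of this notebook; it is included because scikit-learn's documentation notes that multi-layer perceptrons are sensitive to feature scaling.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def pick_alpha(X_train, y_train, X_val, y_val, alphas, hidden=(100, 100, 100, 100)):
    """Return (best_alpha, scores), scoring each alpha by validation accuracy."""
    scores = {}
    for alpha in alphas:
        model = make_pipeline(
            StandardScaler(),  # assumption: standardize features before SGD
            MLPClassifier(solver='sgd', alpha=alpha,
                          hidden_layer_sizes=hidden, random_state=1),
        )
        model.fit(X_train, y_train)
        scores[alpha] = model.score(X_val, y_val)
    best_alpha = max(scores, key=scores.get)
    return best_alpha, scores

# Synthetic stand-in data so the sketch runs without the real csv
rng = np.random.RandomState(0)
X_demo = rng.normal(size=(200, 5))
y_demo = (X_demo[:, 0] > 0).astype(int)
best_alpha, alpha_scores = pick_alpha(
    X_demo[:150], y_demo[:150], X_demo[150:], y_demo[150:],
    alphas=[1e-5, 1e-3], hidden=(10,))
```

The same loop could be extended to vary `hidden_layer_sizes`, at the cost of one extra fit per combination.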

Fit the network defined above to the training examples and their labels.

In [47]:
clf.fit(training_examples, training_labels)
MLPClassifier(activation='relu', alpha=0.001, batch_size='auto',
       beta_1=0.9, beta_2=0.999, early_stopping=False,
       epsilon=1e-08, hidden_layer_sizes=(100, 100, 100, 100), learning_rate='constant',
       learning_rate_init=0.001, max_iter=500, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=1, shuffle=True,
       solver='sgd', tol=0.0001, validation_fraction=0.1, verbose=False,
       warm_start=False)

Produce predictions on our test examples based on the classifier that we defined above. 

In [48]:
y_pred = clf.predict(test_examples)

Compute the accuracy of these predictions, as a percentage, against the test labels.

In [50]:
accuracy_score(test_labels, y_pred) * 100


7.5
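Because each year is treated as a separate class, accuracy counts a prediction that is off by one year the same as one that is off by fifty. A complementary view, sketched below, is the mean absolute error in years and the share of predictions within a small tolerance. The function name `year_error_summary`, the tolerance of 2 years, and the sample values are made up for illustration and are not results from this model.

```python
import numpy as np

def year_error_summary(true_years, predicted_years, tolerance=2):
    """Return (mean absolute error in years, share of predictions within `tolerance` years)."""
    true_years = np.asarray(true_years)
    predicted_years = np.asarray(predicted_years)
    abs_err = np.abs(true_years - predicted_years)
    return float(abs_err.mean()), float((abs_err <= tolerance).mean())

# Made-up values for illustration only
mae, within_two = year_error_summary([2000, 1995, 2007, 1980],
                                     [2001, 1995, 2003, 1990])
```

On the notebook's data, the call would be `year_error_summary(test_labels, y_pred)`.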

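Get the precision, recall, and F-score of our model. Accuracy is a useful metric, but it doesn't tell the entire story, so we should also look at these other metrics. Note that with average="micro" on a single-label multiclass problem, precision, recall, and F-score all equal the accuracy, which is why the three values in the output match the 7.5% accuracy.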

In [51]:
precision_recall_fscore_support(test_labels, y_pred, average="micro")

(0.074999999999999997, 0.074999999999999997, 0.074999999999999997, None)
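The identical values above follow from micro-averaging: when every example has exactly one label, micro precision, recall, and F-score all reduce to accuracy. The small sketch below, using made-up labels, shows this and computes the macro average for comparison, which weights each year equally and can therefore differ.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Made-up labels for illustration only
demo_true = [2000, 2000, 2001, 2005, 2005, 2005]
demo_pred = [2000, 2001, 2001, 2005, 2005, 2000]

demo_accuracy = accuracy_score(demo_true, demo_pred)

# Micro: pool all decisions, then compute the metrics once
micro = precision_recall_fscore_support(demo_true, demo_pred, average="micro")

# Macro: compute the metrics per class, then take the unweighted mean
macro = precision_recall_fscore_support(demo_true, demo_pred, average="macro")
```

For this notebook, swapping `average="micro"` for `average="macro"` in the call above would give per-year performance equal weight, which may be more informative when some years have far more songs than others.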