# TensorFlow Neural Network Model Building
This Jupyter Notebook is the product of Thomas Hymel. The contents pertain to one part of an Automatic Drum Transcription (ADT) project that I am working on to improve my data science and machine learning skills. This notebook in particular focuses on building the neural network model using Keras and TensorFlow. The data used for training the model will initially be the 25-song data set that has been created in other Jupyter Notebooks. The training data is in two matrices, X (input) and Y (output labels), that will be loaded into this Notebook, along with a Python dictionary that describes the X and Y data sets. 

#### Keras and TensorFlow
Keras is a deep learning API written in Python that runs on top of the machine learning platform TensorFlow. It allows users to quickly and easily build a model with layers as building blocks. I will be using Keras layers to build up a CNN, and perhaps a CRNN after that depending on the success (or expected lack of success) of the CNN. 

#### Considerations
This project is a "multi-label" classification problem. This means that each example can belong to *multiple* classes among the available final classification options. Specifically, a bass drum event is independent from a snare drum event and from a hihat event and from a cymbal event and from a tom event, because a drummer, having multiple limbs, is able to play several of these drum pieces simultaneously. As such, the label matrix Y is multi-hot rather than strictly one-hot: any given example row can contain multiple 1s. 
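
As a small illustration of the multi-label idea, the sketch below builds one label row for a frame in which two drum pieces are struck at once. The class list and its order are assumptions made up for this example; the real column order is stored in note_dict['Y column index order'].

```python
import numpy as np

# Hypothetical class order, for illustration only; the real order lives in
# note_dict['Y column index order'].
CLASSES = ["bass", "snare", "hihat", "cymbal", "tom"]

def multi_hot(hits, classes=CLASSES):
    """Return a 0/1 vector with a 1 for every drum piece struck in this frame."""
    row = np.zeros(len(classes), dtype=np.int8)
    for name in hits:
        row[classes.index(name)] = 1
    return row

# A bass drum and a hihat played at the same time: two entries are set to 1,
# which a strictly one-hot encoding could not express.
row = multi_hot({"bass", "hihat"})
print(row)
```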

Because of this, the final dense layer in the model **should not** use the softmax activation function. Softmax normalizes the outputs so that they sum to 1 across all classes, so the probability assigned to one class is necessarily weighted by the presence of the other classes; that suits problems where exactly one class is correct. The activation function for the final layer therefore needs to be the **sigmoid function** (using activation = 'sigmoid'), because it maps each output to a probability between 0 and 1 without any consideration of or weighting from the other classes. Additionally, the loss function used when the model is compiled should be loss = 'binary_crossentropy'.
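
The difference can be checked numerically with plain NumPy. The sketch below is not the Keras implementation; the helper functions and the example logits are assumptions written for this illustration. It shows that softmax cannot give two simultaneous events a high probability each, while sigmoid can, and that the binary cross-entropy is lower for the sigmoid outputs.

```python
import numpy as np

def softmax(z):
    """Softmax over a 1-D vector of logits (outputs sum to 1)."""
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

def sigmoid(z):
    """Element-wise sigmoid; each output is independent of the others."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean per-class binary cross-entropy, clipped to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))

# Made-up logits: two classes strongly present, one strongly absent.
logits = np.array([4.0, 4.0, -3.0])
y_true = np.array([1.0, 1.0, 0.0])

p_soft = softmax(logits)   # the two present classes must share the total of 1
p_sig = sigmoid(logits)    # each present class can be close to 1 on its own

print("softmax:", p_soft, "loss:", binary_crossentropy(y_true, p_soft))
print("sigmoid:", p_sig, "loss:", binary_crossentropy(y_true, p_sig))
```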

In [43]:
# import relevant packages
import pandas as pd
import numpy as np
import random
import json
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import *

# get the X, Y, and note_dict data from the 25 song training set
fp = "C:\\Users\\Thomas\\Python Projects\\Drum-Tabber-Support-Data\\Saved-Output\\"
X_temp = np.load(fp + 'X_june29th.npy')
Y_all = np.load(fp + 'Y_june29th.npy')
with open(fp + 'note_dict_june29th.json') as json_file: 
    note_dict = json.load(json_file)
X_all = np.transpose(X_temp, (1,0,2))   # because of the way I had previously output the numpy array, should put the num_examples first

In [83]:
print("X_all.shape", X_all.shape)
print("Y_all.shape", Y_all.shape)
print(list(note_dict.keys()))
print(note_dict['X song index range'])
index_range_d = note_dict['X song index range']
print(int(0.15 * len(note_dict['X song index range'].keys())))
random_sample = sorted(random.sample(list(index_range_d.keys()), k = 4))
random_sample2 = sorted(random.sample([x for x in list(index_range_d.keys()) if x not in random_sample], k = 4))
print(random_sample)
print(random_sample2)
combined = [random_sample, random_sample2]
print(combined)
                         
print(index_range_d[random_sample[0]])
print(X_all[index_range_d[random_sample[0]][0]:index_range_d[random_sample[0]][1]][:][:].shape)

Xtest, Ytest, Xdtest, Ydtest, Xttest, Yttest = split_XY(X_all, Y_all, note_dict)

X_all.shape (199568, 200, 3)
Y_all.shape (199568, 7, 3)
['n_mels', 'sr', 'n_spectro_slices', 'include_LR', 'include_differential', 'X song order', 'X song length', 'X song index range', 'X spectro index order', 'Y column index order']
{'ancient_tombs': [0, 11523], 'best_of_me': [11524, 18031], 'boulevard_of_broken_dreams': [18032, 23787], 'cant_be_saved': [23788, 32223], 'face_down': [32224, 41459], 'family_tradition': [41460, 51247], 'fireworks_at_dawn': [51248, 53675], 'forever_at_last': [53676, 60819], 'four_years': [60820, 69351], 'garden_state': [69352, 79979], 'gunpowder': [79980, 92575], 'hair_of_the_dog': [92576, 100187], 'lungs_like_gallows': [100188, 108071], 'misery_business': [108072, 117667], 'mookies_last_christmas': [117668, 125407], 'planning_a_prison_break': [125408, 140571], 'rollercoaster': [140572, 147871], 'sow': [147872, 153779], 'sugar_were_going_down': [153780, 158263], 'surprise_surprise': [158264, 165747], 'thats_what_you_get': [165748, 173135], 'the_dark': [1

In [84]:
print(Xdtest.shape, Ydtest.shape, Xttest.shape, Yttest.shape)

(34745, 200, 3) (34745, 7, 3) (19273, 200, 3) (19273, 7, 3)


#### split_XY function
Although there are probably TensorFlow functions that automatically do the following, I am going to write a function that splits the X and Y data sets into training, development (dev), and test sets. I am doing this because I want to preserve the order of the examples, since they are still in chronological order within each song. 
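
One way to keep the chronological order inside a single song is to cut its index range into contiguous pieces. The sketch below is only an illustration of that idea: the function name is hypothetical, the toy range is made up, and it assumes that the [start, end] ranges stored in note_dict['X song index range'] are inclusive on both ends.

```python
import numpy as np

def split_song_chronologically(start, end, dev_split=0.15, test_split=0.15):
    """Split one song's inclusive index range [start, end] into contiguous
    train, dev, and test index arrays, keeping chronological order."""
    idx = np.arange(start, end + 1)      # assumes the end index is inclusive
    n = len(idx)
    n_dev = int(dev_split * n)
    n_test = int(test_split * n)
    n_train = n - n_dev - n_test
    train_idx = idx[:n_train]
    dev_idx = idx[n_train:n_train + n_dev]
    test_idx = idx[n_train + n_dev:]
    return train_idx, dev_idx, test_idx

# Toy song with 100 examples (indices 0 to 99).
train_idx, dev_idx, test_idx = split_song_chronologically(0, 99)
print(len(train_idx), len(dev_idx), len(test_idx))
```

Because every piece is a contiguous slice, no example is shuffled out of its position in time.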

In [79]:
def split_XY(X,Y, note_dict, dev_split = 0.15, test_split = 0.15, sample_each_song = False):
    """
    Splits the full X and Y data set into a train, dev, and test split, based normally on the choice of random SONGS,
    unless the sample each song boolean is true. If true, the data set is split in such a way that it takes a sequential portion
    of each song, where the total portions are roughly each equal to the dev and test splits.
    
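    Args:
        X [np.array]: input examples, with the example index on the first axis
        Y [np.array]: output labels, in the same first-axis order as X
        note_dict [dict]: dictionary created in create_XY used to describe the X and Y data sets
        dev_split [float]: fraction of the songs (or of each song) assigned to the dev set
        test_split [float]: fraction of the songs (or of each song) assigned to the test set
        sample_each_song [bool]: if True, take a sequential portion of each song instead of choosing whole songs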
        
    Returns:
        np.array: X_train
        np.array: Y_train
        np.array: X_dev
        np.array: Y_dev
        np.array: X_test
        np.array: Y_test
    """
    
    song_index_dict = note_dict['X song index range']  # gets the dict where { song_title : index_range_of_that_songs_examples ([index_start, index_end]) }
    songs = list(song_index_dict.keys())     # list of song title strings in entire data set
    n_songs = len(songs)     # number of songs in the entire data set
    
    if not sample_each_song: # in the case where each song is NOT sampled, and instead entire songs are chosen for the dev and test sets
        n_songs_dev = int(dev_split*n_songs)  # int number of songs in dev set
        n_songs_test = int(test_split*n_songs)  # int number of songs in test set
        songs_dev = sorted(random.sample(songs, k = n_songs_dev))    # grab dev number of songs from the songs list, and sorts alphabetically
        songs_test = sorted(random.sample([x for x in songs if x not in songs_dev], k = n_songs_test))  # grab test number of songs from the songs list, but not including the dev list, and sorts alphabetically
        
        set_list = [songs_dev, songs_test]  # ha, set_list, like a band would play live a bunch of songs
        
        XdYdXtYt = [] # list of np.arrays that will correspond to, in order, X_dev, Y_dev, X_test, Y_test
        # get slices of X,Y that correspond to the dev set first time through, test set second time through
        for song_list in set_list:
            X_dt_list = []   # will be a list of np.arrays from X
            Y_dt_list = []   # will be a list of np.arrays from Y
            for song in song_list:
                index_range = song_index_dict[song]
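                # the index ranges in note_dict look inclusive (each song starts at the previous end + 1), so add 1 to the slice end
                X_song = X[ index_range[0]:index_range[1] + 1 ]
                Y_song = Y[ index_range[0]:index_range[1] + 1 ]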
                X_dt_list.append(X_song)
                Y_dt_list.append(Y_song)
            X_dt = np.concatenate(X_dt_list, axis = 0)
            Y_dt = np.concatenate(Y_dt_list, axis = 0)
            XdYdXtYt.append(X_dt)
            XdYdXtYt.append(Y_dt)
        
        # set the outputs equal to their proper sets from the list
        X_dev, Y_dev, X_test, Y_test = XdYdXtYt[0], XdYdXtYt[1], XdYdXtYt[2], XdYdXtYt[3]
        
    
    else: # in the case where each song IS sampled roughly equally according to the dev_split and test_split, still in chrono order in the songs and alphabetical by song title
        raise NotImplementedError("sample_each_song = True is not implemented yet")
    
    # TODO: X_train and Y_train are still the full data set; the dev and test examples need to be removed later
    X_train = X
    Y_train = Y
    
    return X_train, Y_train, X_dev, Y_dev, X_test, Y_test
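
The remaining TODO in split_XY (removing the dev and test examples from the training set) could be handled with a boolean mask. The sketch below is one possible approach, not the final implementation: training_mask is a hypothetical helper, the toy ranges are made up, and the index ranges are assumed to be inclusive.

```python
import numpy as np

def training_mask(n_examples, song_index_dict, held_out_songs):
    """Boolean mask that is False for every example of a held-out song."""
    mask = np.ones(n_examples, dtype=bool)
    for song in held_out_songs:
        start, end = song_index_dict[song]
        mask[start:end + 1] = False      # assumes the end index is inclusive
    return mask

# Toy data: 10 examples spread over three made-up songs.
toy_ranges = {"song_a": [0, 3], "song_b": [4, 6], "song_c": [7, 9]}
X_toy = np.arange(10).reshape(10, 1)

mask = training_mask(len(X_toy), toy_ranges, ["song_b"])
X_train_toy = X_toy[mask]                # the original order is preserved
print(X_train_toy.ravel())
```

Inside split_XY, the same mask built from songs_dev + songs_test could be applied to both X and Y so that their rows stay aligned.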

In [31]:
def CNN_2018(input_shape, n_classes, n_mels, include_differential, include_LR):
    """
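    Creates a Keras NN model for the processing of the data. 
    Note that this NN model is based on the "Towards Multi-Instrument Drum Transcription" paper's model
    
    Args:
        input_shape [tuple]: shape of one example without the batch axis; Conv2D expects (rows, cols, channels)
        n_classes [int]: number of output classes (one sigmoid unit per class)
        n_mels [int]: value of note_dict['n_mels'] (not used yet)
        include_differential [bool]: value of note_dict['include_differential'] (not used yet)
        include_LR [bool]: value of note_dict['include_LR'] (not used yet)
        
    Returns:
        Keras Model: the uncompiled model
    """
    
    x = Input(shape = input_shape, name = 'input', dtype = 'float32')
    
    # normalization of the initial input data
    y = BatchNormalization(axis=2)(x)
    
    # 2 x convolutional layer (32 filter x  3x3)
    # padding = 'same' is an assumption; it keeps the feature map size unchanged
    y = Conv2D(32, (3, 3), padding = 'same')(y)
    y = Activation('relu')(y)
    y = BatchNormalization()(y)
    y = Conv2D(32, (3, 3), padding = 'same')(y)
    y = Activation('relu')(y)
    y = BatchNormalization()(y)
    
    # max pool (1x3)
    y = MaxPool2D(pool_size = (1, 3))(y)
    
    # 2 x convolutional layer (32 filter x  3x3)
    y = Conv2D(32, (3, 3), padding = 'same')(y)
    y = Activation('relu')(y)
    y = BatchNormalization()(y)
    y = Conv2D(32, (3, 3), padding = 'same')(y)
    y = Activation('relu')(y)
    y = BatchNormalization()(y)
    
    # max pool (1x3)
    y = MaxPool2D(pool_size = (1, 3))(y)
    
    # dense layers (the plan is 2 x dense (256); a single 64 unit layer is used for now)
    y = Flatten()(y)
    y = Dense(64, activation = 'relu')(y)
    
    # multi-label output: one independent sigmoid per class, as discussed in Considerations
    y = Dense(n_classes, activation = 'sigmoid', name = 'output')(y)
    
    return keras.Model(inputs = x, outputs = y)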
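
Before building the model it can help to trace the feature map shapes by hand. The sketch below does that arithmetic for the layer plan described in the comments (32 filters of 3x3, then a 1x3 max pool, twice). It is only an illustration: the helper names and the example input shape are made up, the convolutions are assumed to use 'same' padding, and the pooling is assumed to use the Keras defaults (strides equal to the pool size, 'valid' padding).

```python
def conv_same(shape, filters):
    """Shape after a stride-1 convolution with 'same' padding."""
    rows, cols, _ = shape
    return (rows, cols, filters)

def max_pool(shape, pool):
    """Shape after non-overlapping max pooling with 'valid' padding."""
    rows, cols, channels = shape
    pool_rows, pool_cols = pool
    return ((rows - pool_rows) // pool_rows + 1,
            (cols - pool_cols) // pool_cols + 1,
            channels)

# Hypothetical input shape (rows, cols, channels), chosen only for this example.
shape = (18, 80, 1)
for block in range(2):
    shape = conv_same(shape, 32)     # first 3x3 convolution of the block
    shape = conv_same(shape, 32)     # second 3x3 convolution of the block
    shape = max_pool(shape, (1, 3))  # 1x3 max pool shrinks only the second axis
    print("after block", block + 1, shape)

n_flat = shape[0] * shape[1] * shape[2]   # size of the Flatten() output
print("flattened size:", n_flat)
```

With 'valid' pooling, the pooled axis must be at least as long as the pool size at every stage, which is worth checking against the real input shape.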

#### List of Useful Shortcuts

* Ctrl + shift + P = List of Shortcuts
* Enter (command mode) = Enter Edit Mode (enter cell to edit it)
* Esc (edit mode) = Enter Command Mode (exit cell)
* A = Create Cell above
* B = Create Cell below
* D,D = Delete Cell
* Shift + Enter = Run Cell (code or markdown)
* M = Change Cell to Markdown
* Y = Change Cell to Code
* Ctrl + Shift + Minus = Split Cell at Cursor