# Deep learning Bach. A step-by-step guide.

I decided to merge two of my great passions, artificial intelligence and the music of Bach, and to guide you step by step through building a computer system that learns to compose "like Bach" by examining his collected works. Seriously: by the end of this tutorial you will have a working artificial composer and, hopefully, you will also understand how and why it works. And I am not going to lie to you: similar systems already exist, created by very smart university and corporate researchers, but also by hobbyists.

We live in exciting times. The amount of data available online in the public domain is incredible, and the tools that let us manipulate that data in really interesting ways (read: machine learning) have matured. One thing in all of that is really something to be thankful for: somehow the idea of sharing prevails. A lot of extremely valuable stuff just lies there, waiting to be used, and very smart people spend incredible amounts of their time making even more stuff publicly available and understandable. Kudos. I owe them. Hence this guide.

Callout: Why Bach?

Callout: What is a Machine Learning / Deep Learning

So, who is the audience of this guide?

Certainly you are not an expert in deep learning or in musicology. You

So, what are we going to do?

Let's try to sketch

## Plan of attack

As always, divide and conquer is a highly successful strategy, so let's try:

1. We will start by creating the workbench we are going to use.
2. Then we need to get a lot of example data to teach our composer.
3. Understand the data well enough to make it useful.
4. Prepare it so it is suitable for ML.
5. Build a neural network.
6. And teach it using the data we prepared.
7. Paradoxically, this is not the end. We now need to uncover all that innate knowledge and make it express itself in writing.
8. Now, let's try to ask it to compose something for us.
9. Turn that score into a playable MIDI file and...
10. Finally, play it!

Callout: Useful links.

In [86]:
import numpy as np
import glob
import sys
import collections
import random
import math
from os.path import basename
from itertools import permutations

vocabulary_size = 30**4
MAX_VOICES = 4

def build_dataset(words):
  count = [['UNK', -1]]
  count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
  dictionary = dict()
  for word, _ in count:
    dictionary[word] = len(dictionary)
  data = list()
  unk_count = 0
  for word in words:
    if word in dictionary:
      index = dictionary[word]
    else:
      index = 0  # dictionary['UNK']
      unk_count = unk_count + 1
    data.append(index)
  count[0][1] = unk_count
  reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys())) 
  return data, count, dictionary, reverse_dictionary

def permutate(s, pat):
    assert len(s) == len(pat), "string len: {} != pattern len: {}".format(len(s), len(pat))
    r = []
    for ind in pat:
        r.append(s[ind])
    return "".join(r)

def read_data_files(path, validation=True):
    """Read all *.txt files found recursively under the given directory.
    Optionally set aside the last pieces as validation data.
    No validation data is returned when opusranges ends up with fewer than 5 entries.
    :param path: root directory of the data files, for example "data"
    :param validation: if True (default), sets the last pieces aside as validation data
    :return: all data, training data, validation data, list of loaded pieces with ranges,
        count, dictionary, reverse_dictionary
    """
    codetext = []
    opusranges = []
    bachlist = glob.glob(path + '/**/*.txt', recursive=True)
    for bachfile in bachlist:
        # the context manager closes the file even when the piece is skipped
        with open(bachfile, "r", encoding='utf8') as bachtext:
            bars = bachtext.read().split("!")
        start = len(codetext)
        nb_voices = len(bars[0])  # the width of the first bar is taken as the voice count
        if nb_voices <= MAX_VOICES:
            print("Loading file: {} ; {} voices".format(bachfile, nb_voices))
            # pad shorter bars with spaces up to MAX_VOICES characters
            bars2 = [bar.ljust(MAX_VOICES, " ") for bar in bars]
            codetext.extend(bars2)
            end = len(codetext)
            opusranges.append({"start": start, "end": end, "name": basename(bachfile).split(".")[0]})

            patterns = list(permutations(range(MAX_VOICES)))

            for pattern_no in range(1,len(patterns)):
                start2 = len(codetext)
                bars = []
                for j in range(start, end-1): #iterate over the whole opus except the final divider ("********")
                    #print(codetext[j], patterns[pattern_no])
                    bars.append(permutate(codetext[j],patterns[pattern_no]))
                assert codetext[end-1] == "********", "expecting '********', instead '{}'".format(codetext[end-1])
                bars.append(codetext[end-1]) 
                codetext.extend(bars)
                end2 = len(codetext)
                opusranges.append({"start": start2, "end": end2, "name": "Perm"+str(pattern_no)+basename(bachfile).split(".")[0]})
        else:
            print("Skipping file: {} ; {} voices".format(bachfile, nb_voices))
    if len(opusranges) == 0:
        sys.exit("No training data has been found. Aborting.")
    
    total_len = len(codetext)
    
    data, count, dictionary, reverse_dictionary = build_dataset(codetext)
    
    # For validation, use roughly 90K of text,
    # but no more than 10% of the entire text
    # and no more than 1 book in 5 => no validation at all for fewer than 5 entries in opusranges.

    # 10% of the text is how many files ?
    validation_len = 0
    nb_opus1 = 0
    for opus in reversed(opusranges):
        validation_len += opus["end"]-opus["start"]
        nb_opus1 += 1
        if validation_len > total_len // 10:
            break

    # 90K of text is how many books ?
    validation_len = 0
    nb_opus2 = 0
    for opus in reversed(opusranges):
        validation_len += opus["end"]-opus["start"]
        nb_opus2 += 1
        if validation_len > 90*1024:
            break

    # 20% of the books is how many books ?
    nb_opus3 = len(opusranges) // 5

    # pick the smallest
    nb_opus = min(nb_opus1, nb_opus2, nb_opus3)

    if nb_opus == 0 or not validation:
        cutoff = total_len
    else:
        cutoff = opusranges[-nb_opus]["start"]
    validata = data[cutoff:]
    codedata = data[:cutoff]
    return data, codedata, validata, opusranges, count, dictionary, reverse_dictionary
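
The validation split at the end of `read_data_files` picks the smallest of three candidate counts of trailing pieces. Here is a minimal sketch of just that rule, run on invented ranges. The helper name `count_validation_opuses` and the example numbers are my assumptions for illustration, not part of the notebook.

```python
def count_validation_opuses(opusranges, total_len, max_len=90 * 1024):
    """Return how many trailing opuses to hold out, using the three limits."""
    def opuses_needed(limit):
        # walk backwards until the accumulated length exceeds the limit
        length, n = 0, 0
        for opus in reversed(opusranges):
            length += opus["end"] - opus["start"]
            n += 1
            if length > limit:
                break
        return n

    by_fraction = opuses_needed(total_len // 10)  # roughly 10% of the text
    by_size = opuses_needed(max_len)              # roughly 90K symbols
    by_count = len(opusranges) // 5               # at most 1 opus in 5
    return min(by_fraction, by_size, by_count)

# invented example: 10 opuses of 100 bars each
ranges = [{"start": i * 100, "end": (i + 1) * 100} for i in range(10)]
n_val = count_validation_opuses(ranges, total_len=1000)
cutoff = ranges[-n_val]["start"] if n_val > 0 else 1000
print(n_val, cutoff)  # 2 800
```

One thing worth keeping in mind: because every file is followed by its permuted copies in `opusranges`, the held-out tail probably consists of permuted copies of the last file, while other copies of the same piece stay in the training data.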

In [87]:
%pwd

'/Users/Zufek/ml/music_rnn'

In [88]:
PATH = "../../ml/music_rnn/txt"

data, codetext, valitext, opusranges, count, dictionary, reverse_dictionary = read_data_files(PATH, validation=True)

#data, count, dictionary, reverse_dictionary = build_dataset(dypthongs)
#print('Most common words (+UNK)', count[:5])
#print('Sample data', data[:10])

Skipping file: ../../ml/music_rnn/txt/major/9/bjsbmm12.txt ; 9 voices
Skipping file: ../../ml/music_rnn/txt/major/9/bjsbmm07.txt ; 9 voices
Skipping file: ../../ml/music_rnn/txt/major/9/bjsbmm14.txt ; 9 voices
Skipping file: ../../ml/music_rnn/txt/major/9/bwv29sin.txt ; 9 voices
Skipping file: ../../ml/music_rnn/txt/major/9/bwv667.txt ; 9 voices
Skipping file: ../../ml/music_rnn/txt/major/11/bwv0202.txt ; 11 voices
Skipping file: ../../ml/music_rnn/txt/major/7/bwv668.txt ; 7 voices
Skipping file: ../../ml/music_rnn/txt/major/7/bwv1041b.txt ; 7 voices
Skipping file: ../../ml/music_rnn/txt/major/7/bwv988.txt ; 7 voices
Skipping file: ../../ml/music_rnn/txt/major/16/BOURREE.txt ; 16 voices
Skipping file: ../../ml/music_rnn/txt/major/16/bwv0243.txt ; 16 voices
Skipping file: ../../ml/music_rnn/txt/major/16/GIGUE.txt ; 16 voices
Skipping file: ../../ml/music_rnn/txt/major/16/GAVOTTE.txt ; 16 voices
Skipping file: ../../ml/music_rnn/txt/major/16/rejouiss.txt ; 16 voices
Skipping file: ../../

Loading file: ../../ml/music_rnn/txt/major/3/988-v09.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/pfa-1pre.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/vp3-2lou.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/cs1-6gig.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/988-v24.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/988-v19.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/sonat_4c.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/bwv525-3.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/cs3-5bou.txt ; 3 voices
Skipping file: ../../ml/music_rnn/txt/major/3/Wtcii03b.txt ; 8 voices
Loading file: ../../ml/music_rnn/txt/major/3/bwv525-1.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/sonat_6e.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/988-v03.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/major/3/988-v02.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/m

Loading file: ../../ml/music_rnn/txt/minor/4/cs2-1pre.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/cs6-2all.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/vp1-7tb.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/cs5-1pre.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/vp2-1all.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/bwv971.txt ; 3 voices
Loading file: ../../ml/music_rnn/txt/minor/4/988-v22.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/bwv582.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/vs1-1ada.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/Wtcii02b.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/Wtcii22b.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/vp1-5sa.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/bjsbmm22.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/minor/4/bwv1074i.txt ; 4 voices
Loading file: ../../ml/music_rnn/txt/mino

Loading file: ../../ml/music_rnn/txt/minor/2/vs2-4alg.txt ; 2 voices
Skipping file: ../../ml/music_rnn/txt/minor/2/sonat_6c.txt ; 8 voices
Skipping file: ../../ml/music_rnn/txt/minor/2/bwv807a.txt ; 8 voices
Loading file: ../../ml/music_rnn/txt/minor/2/988-v05.txt ; 2 voices
Loading file: ../../ml/music_rnn/txt/minor/2/988-v11.txt ; 2 voices
Loading file: ../../ml/music_rnn/txt/minor/2/997-1pre.txt ; 2 voices
Skipping file: ../../ml/music_rnn/txt/minor/2/bwv810d.txt ; 8 voices
Skipping file: ../../ml/music_rnn/txt/minor/2/bwv811b.txt ; 8 voices
Loading file: ../../ml/music_rnn/txt/minor/2/bwv805th.txt ; 2 voices
Loading file: ../../ml/music_rnn/txt/minor/2/bwv807b.txt ; 2 voices
Loading file: ../../ml/music_rnn/txt/minor/2/bwv807c.txt ; 2 voices
Loading file: ../../ml/music_rnn/txt/minor/2/bwv811c.txt ; 2 voices
Skipping file: ../../ml/music_rnn/txt/minor/2/bwv810g.txt ; 8 voices
Skipping file: ../../ml/music_rnn/txt/minor/2/Wtcii10a.txt ; 8 voices
Loading file: ../../ml/music_rnn/txt/
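
The voice count reported in the log above is simply the width of the first bar of each file. A small sketch of what loading one file does, assuming the format implied by the loader (bars separated by `!`, closed by a `********` divider). The file content here is invented, and `permute_bar` is a hypothetical stand-in for `permutate`.

```python
from itertools import permutations

MAX_VOICES = 4

# invented file content: bars separated by "!", closed by the divider
text = "XQS!XQ!LQS!********"

bars = text.split("!")
nb_voices = len(bars[0])  # width of the first bar
padded = [bar.ljust(MAX_VOICES, " ") for bar in bars]

def permute_bar(bar, pattern):
    """Reorder the characters of one bar according to pattern."""
    return "".join(bar[i] for i in pattern)

patterns = list(permutations(range(MAX_VOICES)))
# patterns[0] is the identity, so augmentation starts at patterns[1];
# the divider is copied unchanged
swapped = [permute_bar(b, patterns[1]) for b in padded[:-1]] + [padded[-1]]

print(nb_voices)      # 3
print(len(patterns))  # 24
print(padded)
print(swapped)
```

With 4 voice slots there are 24 orderings, so every loaded piece ends up in `codetext` 24 times: once as written and 23 times with its voices reordered.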

In [239]:
reverse_dictionary

{0: 'UNK',
 1: '    ',
 2: 'X   ',
 3: ' X  ',
 4: '  X ',
 5: '   X',
 6: 'Q   ',
 7: ' Q  ',
 8: '  Q ',
 9: '   Q',
 10: 'S   ',
 11: ' S  ',
 12: '  S ',
 13: '   S',
 14: ']   ',
 15: ' ]  ',
 16: '  ] ',
 17: '   ]',
 18: 'L   ',
 19: ' L  ',
 20: '  L ',
 21: '   L',
 22: 'V   ',
 23: ' V  ',
 24: '  V ',
 25: '   V',
 26: '_   ',
 27: ' _  ',
 28: '  _ ',
 29: '   _',
 30: 'd   ',
 31: ' d  ',
 32: '  d ',
 33: '   d',
 34: 'T   ',
 35: ' T  ',
 36: '  T ',
 37: '   T',
 38: 'b   ',
 39: ' b  ',
 40: '  b ',
 41: '   b',
 42: 'U   ',
 43: ' U  ',
 44: '  U ',
 45: '   U',
 46: 'i   ',
 47: ' i  ',
 48: '  i ',
 49: '   i',
 50: 'P   ',
 51: ' P  ',
 52: '  P ',
 53: '   P',
 54: '`   ',
 55: ' `  ',
 56: '  ` ',
 57: '   `',
 58: 'N   ',
 59: ' N  ',
 60: '  N ',
 61: '   N',
 62: 'Z   ',
 63: ' Z  ',
 64: '  Z ',
 65: '   Z',
 66: '[   ',
 67: ' [  ',
 68: '  [ ',
 69: '   [',
 70: 'J   ',
 71: ' J  ',
 72: '  J ',
 73: '   J',
 74: 'Y   ',
 75: ' Y  ',
 76: '  Y ',
 77: '   Y
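
The mapping above comes from `build_dataset`, which ranks bars by frequency and reserves index 0 for `UNK`. A compact sketch of the same idea on a handful of invented bars (the function name `build_vocab` and the toy data are assumptions for illustration):

```python
import collections

def build_vocab(words, vocabulary_size):
    """Map each distinct word to an index, most frequent first; 0 is 'UNK'."""
    dictionary = {"UNK": 0}
    for word, _ in collections.Counter(words).most_common(vocabulary_size - 1):
        dictionary[word] = len(dictionary)
    # words that did not make it into the vocabulary map to 0
    data = [dictionary.get(word, 0) for word in words]
    reverse_dictionary = {index: word for word, index in dictionary.items()}
    return data, dictionary, reverse_dictionary

bars = ["    ", "X   ", "    ", " Q  ", "    ", "X   "]  # invented bars
data, dictionary, reverse_dictionary = build_vocab(bars, vocabulary_size=3)
print(data)                # [1, 2, 1, 0, 1, 2]
print(reverse_dictionary)
```

With a vocabulary of 3, only the two most frequent bars get their own index and the rare `' Q  '` falls back to `UNK`.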

In [None]:
from tensorflow.contrib.keras import models as tfm
from tensorflow.contrib.keras import layers as tfl
from tensorflow.contrib.keras import optimizers
from tensorflow.contrib.keras import regularizers
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold
from tensorflow.contrib.keras import wrappers as tfw
import tensorflow as tf
from tensorflow.contrib.keras import backend as K
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12

n_vocab = len(dictionary)+1
#n_embed_size = 20
p_dropout = 0.3

SEQLEN = 64
BATCHSIZE = 32
INTERNALSIZE = 256
NB_EPOCHS = 10

def cos_distance(y_true, y_pred):
    y_true = K.l2_normalize(y_true, axis=-1)
    y_pred = K.l2_normalize(y_pred, axis=-1)
    return K.mean(1 - K.sum((y_true * y_pred), axis=-1))

def rnn_minibatch_sequencer(raw_data, batch_size, sequence_size, nb_epochs):
    """
    Divides the data into batches of sequences so that all the sequences in one batch
    continue in the next batch. This is a generator that will keep returning batches
    until the input data has been seen nb_epochs times. Sequences are continued even
    between epochs, apart from one, the one corresponding to the end of raw_data.
    The remainder at the end of raw_data that does not fit in a full batch is ignored.
    :param raw_data: the training text
    :param batch_size: the size of a training minibatch
    :param sequence_size: the unroll size of the RNN
    :param nb_epochs: number of epochs to train on
    :return:
        x: one batch of training sequences
        y: one batch of target sequences, i.e. training sequences shifted by 1
        epoch: the current epoch number (starting at 0)
    """
    data = raw_data
    data_len = data.shape[0]
    print(data_len)
    # using (data_len-1) because we must provide for the sequence shifted by 1 too
    nb_batches = (data_len - 1) // (batch_size * sequence_size)
    assert nb_batches > 0, "Not enough data, even for a single batch. Try using a smaller batch_size."
    rounded_data_len = nb_batches * batch_size * sequence_size
    xdata = np.reshape(data[0:rounded_data_len, :], [batch_size, nb_batches * sequence_size, MAX_VOICES*2])
    ydata = np.reshape(data[1:rounded_data_len + 1, :], [batch_size, nb_batches * sequence_size, MAX_VOICES*2])

    for epoch in range(nb_epochs):
        for batch in range(nb_batches):
            x = xdata[:, batch * sequence_size:(batch + 1) * sequence_size]
            y = ydata[:, batch * sequence_size:(batch + 1) * sequence_size]
            x = np.roll(x, -epoch, axis=0)  # to continue the text from epoch to epoch (do not reset rnn state!)
            y = np.roll(y, -epoch, axis=0)
            yield x, y, epoch
            
def rnn_validata_sequencer(raw_data, batch_size, sequence_size, nb_epochs):
    """
    Validation counterpart of rnn_minibatch_sequencer. The batching logic is identical,
    so this generator simply delegates to it.
    :param raw_data: the validation data
    :param batch_size: the size of a minibatch
    :param sequence_size: the unroll size of the RNN
    :param nb_epochs: number of passes over the data
    :return:
        x: one batch of sequences
        y: one batch of target sequences, i.e. the sequences shifted by 1
        epoch: the current epoch number (starting at 0)
    """
    yield from rnn_minibatch_sequencer(raw_data, batch_size, sequence_size, nb_epochs)

def create_model():
    # three stacked LSTM layers followed by a per-time-step dense output layer
    model = tfm.Sequential()
    # only the first layer needs input_shape; each LSTM returns a sequence
    # of vectors of dimension INTERNALSIZE
    model.add(tfl.LSTM(INTERNALSIZE, return_sequences=True,
                       input_shape=(SEQLEN, MAX_VOICES*2),
                       dropout=p_dropout, recurrent_dropout=p_dropout))
    model.add(tfl.LSTM(INTERNALSIZE, return_sequences=True,
                       dropout=p_dropout, recurrent_dropout=p_dropout))
    model.add(tfl.LSTM(INTERNALSIZE, return_sequences=True,
                       dropout=p_dropout, recurrent_dropout=p_dropout))
    # one output vector of MAX_VOICES*2 values per time step
    model.add(tfl.TimeDistributed(tfl.Dense(MAX_VOICES*2, activation='relu')))
    model.compile(loss='mse',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model

model = create_model()
print(model.summary())

step = 0
for x, y_, epoch in rnn_minibatch_sequencer(codevector, BATCHSIZE, SEQLEN, nb_epochs=NB_EPOCHS):
    history = model.train_on_batch(x, y_)  # returns [loss, accuracy]
    if step % 50 == 0 and len(valitext) > 0:
        # NOTE: step counts batches across all epochs, so the (epoch+1) factor
        # overstates the number of samples seen once epoch 0 is over
        print("Traning epoch: {} / batch:{}, samples: {}".format(epoch, step, (epoch+1)*step*BATCHSIZE*SEQLEN))
        print(" Train Data Loss: {}, Accuracy: {}".format(history[0], history[1]))
        # a fresh generator is created each time, so this always evaluates
        # on the first validation batch
        vali_x, vali_y, _ = next(rnn_validata_sequencer(valivector, BATCHSIZE, SEQLEN, nb_epochs=NB_EPOCHS))
        loss, accuracy = model.evaluate(vali_x, vali_y, verbose=1)
        print(" Validation Loss: {}, Accuracy: {}".format(loss, accuracy))
    step += 1
    
     

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_28 (LSTM)               (None, None, 256)         271360    
_________________________________________________________________
lstm_29 (LSTM)               (None, None, 256)         525312    
_________________________________________________________________
lstm_30 (LSTM)               (None, None, 256)         525312    
_________________________________________________________________
time_distributed_5 (TimeDist (None, None, 8)           2056      
Total params: 1,324,040
Trainable params: 1,324,040
Non-trainable params: 0
_________________________________________________________________
None
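
The parameter counts in the summary above can be checked by hand. A quick sketch, using the standard count for an LSTM layer (4 gates, each with input weights, recurrent weights and a bias); the helper names are mine:

```python
def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias
    return 4 * ((input_dim + units + 1) * units)

def dense_params(input_dim, units):
    # one weight per input/output pair plus one bias per output
    return input_dim * units + units

INTERNALSIZE, FEATURES = 256, 8  # INTERNALSIZE and MAX_VOICES*2 from the model above
layers = [
    lstm_params(FEATURES, INTERNALSIZE),
    lstm_params(INTERNALSIZE, INTERNALSIZE),
    lstm_params(INTERNALSIZE, INTERNALSIZE),
    dense_params(INTERNALSIZE, FEATURES),
]
print(layers, sum(layers))  # [271360, 525312, 525312, 2056] 1324040
```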
5914089
Traning epoch: 0 / batch:0, samples: 0
 Train Data Loss: 0.21594806015491486, Accuracy: 0.13671875
92583
 Validation Loss: 0.11447607725858688, Accuracy: 0.2138671875
Traning epoch: 0 / batch:50, samples: 102400
 Train Data Loss: 0.07468955218791962, 

 Validation Loss: 0.025008469820022583, Accuracy: 0.6591796875
Traning epoch: 0 / batch:1700, samples: 3481600
 Train Data Loss: 0.02235705778002739, Accuracy: 0.587890625
92583
 Validation Loss: 0.024439290165901184, Accuracy: 0.62353515625
Traning epoch: 0 / batch:1750, samples: 3584000
 Train Data Loss: 0.026988986879587173, Accuracy: 0.76708984375
92583
 Validation Loss: 0.02483709342777729, Accuracy: 0.6337890625
Traning epoch: 0 / batch:1800, samples: 3686400
 Train Data Loss: 0.026462597772479057, Accuracy: 0.66259765625
92583
 Validation Loss: 0.023004677146673203, Accuracy: 0.6181640625
Traning epoch: 0 / batch:1850, samples: 3788800
 Train Data Loss: 0.031363703310489655, Accuracy: 0.57177734375
92583
 Validation Loss: 0.023420272395014763, Accuracy: 0.6357421875
Traning epoch: 0 / batch:1900, samples: 3891200
 Train Data Loss: 0.028696516528725624, Accuracy: 0.7158203125
92583
 Validation Loss: 0.022311927750706673, Accuracy: 0.64501953125
Traning epoch: 0 / batch:1950, samp

 Validation Loss: 0.021167371422052383, Accuracy: 0.67822265625
Traning epoch: 1 / batch:3550, samples: 14540800
 Train Data Loss: 0.020087983459234238, Accuracy: 0.69921875
92583
 Validation Loss: 0.02060190960764885, Accuracy: 0.740234375
Traning epoch: 1 / batch:3600, samples: 14745600
 Train Data Loss: 0.02348044328391552, Accuracy: 0.78369140625
92583
 Validation Loss: 0.02021585777401924, Accuracy: 0.6689453125
Traning epoch: 1 / batch:3650, samples: 14950400
 Train Data Loss: 0.016928989440202713, Accuracy: 0.74365234375
92583
 Validation Loss: 0.021327516064047813, Accuracy: 0.71044921875
Traning epoch: 1 / batch:3700, samples: 15155200
 Train Data Loss: 0.020298223942518234, Accuracy: 0.6611328125
92583
 Validation Loss: 0.021701565012335777, Accuracy: 0.6513671875
Traning epoch: 1 / batch:3750, samples: 15360000
 Train Data Loss: 0.0200120210647583, Accuracy: 0.6796875
92583
 Validation Loss: 0.020955845713615417, Accuracy: 0.7197265625
Traning epoch: 1 / batch:3800, samples:

 Validation Loss: 0.019892549142241478, Accuracy: 0.7001953125
Traning epoch: 1 / batch:5400, samples: 22118400
 Train Data Loss: 0.025778712704777718, Accuracy: 0.689453125
92583
 Validation Loss: 0.019909273833036423, Accuracy: 0.67138671875
Traning epoch: 1 / batch:5450, samples: 22323200
 Train Data Loss: 0.02275455743074417, Accuracy: 0.69921875
92583
 Validation Loss: 0.020041681826114655, Accuracy: 0.66455078125
Traning epoch: 1 / batch:5500, samples: 22528000
 Train Data Loss: 0.024137087166309357, Accuracy: 0.60546875
92583
 Validation Loss: 0.01978295110166073, Accuracy: 0.66943359375
Traning epoch: 1 / batch:5550, samples: 22732800
 Train Data Loss: 0.01953566074371338, Accuracy: 0.66796875
92583
 Validation Loss: 0.019533833488821983, Accuracy: 0.7275390625
Traning epoch: 1 / batch:5600, samples: 22937600
 Train Data Loss: 0.020978178828954697, Accuracy: 0.72314453125
92583
 Validation Loss: 0.020059548318386078, Accuracy: 0.73388671875
Traning epoch: 1 / batch:5650, sample

 Validation Loss: 0.019651371985673904, Accuracy: 0.6982421875
Traning epoch: 2 / batch:7250, samples: 44544000
 Train Data Loss: 0.019987137988209724, Accuracy: 0.73828125
92583
 Validation Loss: 0.01961713656783104, Accuracy: 0.7197265625
Traning epoch: 2 / batch:7300, samples: 44851200
 Train Data Loss: 0.01851845346391201, Accuracy: 0.728515625
92583
 Validation Loss: 0.01933671534061432, Accuracy: 0.7119140625
Traning epoch: 2 / batch:7350, samples: 45158400
 Train Data Loss: 0.01918385922908783, Accuracy: 0.71875
92583
 Validation Loss: 0.01989550143480301, Accuracy: 0.6708984375
Traning epoch: 2 / batch:7400, samples: 45465600
 Train Data Loss: 0.018893329426646233, Accuracy: 0.7138671875
92583
 Validation Loss: 0.019496535882353783, Accuracy: 0.697265625
Traning epoch: 2 / batch:7450, samples: 45772800
 Train Data Loss: 0.012684321962296963, Accuracy: 0.66845703125
92583
 Validation Loss: 0.019714146852493286, Accuracy: 0.64599609375
Traning epoch: 2 / batch:7500, samples: 4608

 Validation Loss: 0.02049875259399414, Accuracy: 0.6240234375
Traning epoch: 3 / batch:9100, samples: 74547200
 Train Data Loss: 0.018101060763001442, Accuracy: 0.599609375
92583
 Validation Loss: 0.02010771632194519, Accuracy: 0.66162109375
Traning epoch: 3 / batch:9150, samples: 74956800
 Train Data Loss: 0.02117731422185898, Accuracy: 0.65869140625
92583
 Validation Loss: 0.020302562043070793, Accuracy: 0.65380859375
Traning epoch: 3 / batch:9200, samples: 75366400
 Train Data Loss: 0.02082229033112526, Accuracy: 0.69580078125
92583
 Validation Loss: 0.019686827436089516, Accuracy: 0.62451171875
Traning epoch: 3 / batch:9250, samples: 75776000
 Train Data Loss: 0.018599899485707283, Accuracy: 0.70068359375
92583
 Validation Loss: 0.019069606438279152, Accuracy: 0.6611328125
Traning epoch: 3 / batch:9300, samples: 76185600
 Train Data Loss: 0.018743470311164856, Accuracy: 0.79248046875
92583
 Validation Loss: 0.019255777820944786, Accuracy: 0.6787109375
Traning epoch: 3 / batch:9350,

Traning epoch: 3 / batch:10900, samples: 89292800
 Train Data Loss: 0.020891934633255005, Accuracy: 0.70947265625
92583
 Validation Loss: 0.01767801120877266, Accuracy: 0.658203125
Traning epoch: 3 / batch:10950, samples: 89702400
 Train Data Loss: 0.016760360449552536, Accuracy: 0.68408203125
92583
 Validation Loss: 0.018079139292240143, Accuracy: 0.65869140625
Traning epoch: 3 / batch:11000, samples: 90112000
 Train Data Loss: 0.017756372690200806, Accuracy: 0.603515625
92583
 Validation Loss: 0.017520856112241745, Accuracy: 0.63330078125
Traning epoch: 3 / batch:11050, samples: 90521600
 Train Data Loss: 0.015408490784466267, Accuracy: 0.7119140625
92583
 Validation Loss: 0.017697522416710854, Accuracy: 0.61767578125
Traning epoch: 3 / batch:11100, samples: 90931200
 Train Data Loss: 0.019664134830236435, Accuracy: 0.66259765625
92583
 Validation Loss: 0.017646461725234985, Accuracy: 0.6640625
Traning epoch: 3 / batch:11150, samples: 91340800
 Train Data Loss: 0.018062468618154526, 

 Validation Loss: 0.018292700871825218, Accuracy: 0.6552734375
Traning epoch: 4 / batch:12750, samples: 130560000
 Train Data Loss: 0.015075846575200558, Accuracy: 0.68896484375
92583
 Validation Loss: 0.01844346523284912, Accuracy: 0.62646484375
Traning epoch: 4 / batch:12800, samples: 131072000
 Train Data Loss: 0.016428377479314804, Accuracy: 0.6103515625
92583
 Validation Loss: 0.018120838329195976, Accuracy: 0.650390625
Traning epoch: 4 / batch:12850, samples: 131584000
 Train Data Loss: 0.01666995882987976, Accuracy: 0.63525390625
92583
 Validation Loss: 0.018128249794244766, Accuracy: 0.634765625
Traning epoch: 4 / batch:12900, samples: 132096000
 Train Data Loss: 0.018306324258446693, Accuracy: 0.6279296875
92583
 Validation Loss: 0.018806634470820427, Accuracy: 0.6796875
Traning epoch: 4 / batch:12950, samples: 132608000
 Train Data Loss: 0.01596473529934883, Accuracy: 0.63330078125
92583
 Validation Loss: 0.018320579081773758, Accuracy: 0.62890625
Traning epoch: 4 / batch:130

Traning epoch: 5 / batch:14550, samples: 178790400
 Train Data Loss: 0.021779093891382217, Accuracy: 0.65576171875
92583
 Validation Loss: 0.017339710146188736, Accuracy: 0.6591796875
Traning epoch: 5 / batch:14600, samples: 179404800
 Train Data Loss: 0.02100466936826706, Accuracy: 0.61181640625
92583
 Validation Loss: 0.017189733684062958, Accuracy: 0.68408203125
Traning epoch: 5 / batch:14650, samples: 180019200
 Train Data Loss: 0.0212506465613842, Accuracy: 0.58251953125
92583
 Validation Loss: 0.016928283497691154, Accuracy: 0.65380859375
Traning epoch: 5 / batch:14700, samples: 180633600
 Train Data Loss: 0.019681448116898537, Accuracy: 0.6904296875
92583
 Validation Loss: 0.017731009051203728, Accuracy: 0.6474609375
Traning epoch: 5 / batch:14750, samples: 181248000
 Train Data Loss: 0.022214962169528008, Accuracy: 0.5986328125
92583
 Validation Loss: 0.018367623910307884, Accuracy: 0.60888671875
Traning epoch: 5 / batch:14800, samples: 181862400
 Train Data Loss: 0.02266601473

[0.96982968, 0.04296875]
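
To see why the sequencer reshapes the data the way it does, here is a stripped-down sketch on a toy array. It is a simplification under stated assumptions: one-dimensional data instead of 8 features per step, a single epoch, and no `np.roll`. The function name `minibatch_sequencer` is mine.

```python
import numpy as np

def minibatch_sequencer(data, batch_size, sequence_size):
    """Yield (x, y) batches; row i of each batch continues row i of the previous one."""
    # (len - 1) because the targets are the inputs shifted by one step
    nb_batches = (len(data) - 1) // (batch_size * sequence_size)
    assert nb_batches > 0, "not enough data for a single batch"
    rounded = nb_batches * batch_size * sequence_size
    xdata = np.reshape(data[:rounded], [batch_size, nb_batches * sequence_size])
    ydata = np.reshape(data[1:rounded + 1], [batch_size, nb_batches * sequence_size])
    for batch in range(nb_batches):
        cols = slice(batch * sequence_size, (batch + 1) * sequence_size)
        yield xdata[:, cols], ydata[:, cols]

data = np.arange(25)  # invented stand-in for the encoded score
batches = list(minibatch_sequencer(data, batch_size=2, sequence_size=3))
for x, y in batches:
    print(x.tolist(), y.tolist())
```

The first batch holds `[0, 1, 2]` and `[12, 13, 14]`, the second `[3, 4, 5]` and `[15, 16, 17]`: each row picks up exactly where the same row left off, which is what lets an RNN carry its state from batch to batch.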

In [70]:
data_index = 0  # read position in `data`, shared across calls to generate_batch

def generate_batch(batch_size, num_skips, skip_window):
  global data_index
  assert batch_size % num_skips == 0
  assert num_skips <= 2 * skip_window
  batch = np.ndarray(shape=(batch_size), dtype=np.int32)
  labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
  span = 2 * skip_window + 1 # [ skip_window target skip_window ]
  buffer = collections.deque(maxlen=span)
  for _ in range(span):
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  for i in range(batch_size // num_skips):
    target = skip_window  # target label at the center of the buffer
    targets_to_avoid = [ skip_window ]
    for j in range(num_skips):
      while target in targets_to_avoid:
        target = random.randint(0, span - 1)
      targets_to_avoid.append(target)
      batch[i * num_skips + j] = buffer[skip_window]
      labels[i * num_skips + j, 0] = buffer[target]
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  return batch, labels
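
`generate_batch` produces skip-gram training pairs: an input bar and one of its neighbours within `skip_window` steps. The sketch below shows the pairing idea in a deterministic form. It is not the same procedure: the function above walks a circular buffer and samples `num_skips` random neighbours per centre, while this hypothetical `skipgram_pairs` simply lists every pair.

```python
def skipgram_pairs(sequence, skip_window):
    """Return every (center, context) pair with the context at most skip_window steps away."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - skip_window)
        hi = min(len(sequence), i + skip_window + 1)
        for j in range(lo, hi):
            if j != i:  # a bar is never its own context
                pairs.append((center, sequence[j]))
    return pairs

sequence = [5, 1, 7, 1]  # invented bar indices
pairs = skipgram_pairs(sequence, skip_window=1)
print(pairs)  # [(5, 1), (1, 5), (1, 7), (7, 1), (7, 1), (1, 7)]
```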

In [16]:
import tensorflow as tf

batch_size = 128
embedding_size = 20 # Dimension of the embedding vector.
skip_window = 1 # How many words to consider left and right.
num_skips = 2 # How many times to reuse an input to generate a label.
# We pick a random validation set to sample nearest neighbors. Here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent. 
valid_size = 16 # Random set of words to evaluate similarity on.
valid_window = 100 # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(range(valid_window), valid_size))
num_sampled = 64 # Number of negative examples to sample.

graph = tf.Graph()

with graph.as_default(), tf.device('/cpu:0'):

  # Input data.
  train_dataset = tf.placeholder(tf.int32, shape=[batch_size])
  train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
  valid_dataset = tf.constant(valid_examples, dtype=tf.int32)
  
  # Variables.
  # embeddings = tf.Variable(tf.random_uniform([n_vocab, embedding_size], -1.0, 1.0))
  embeddings = tf.Variable(final_embeddings) #use this to continue learning from the saved checkpoint
  softmax_weights = tf.Variable(
    tf.truncated_normal([vocabulary_size, embedding_size],
                         stddev=1.0 / math.sqrt(embedding_size)))
  softmax_biases = tf.Variable(tf.zeros([vocabulary_size]))
  
  # Model.
  # Look up embeddings for inputs.
  embed = tf.nn.embedding_lookup(embeddings, train_dataset)
  # Compute the softmax loss, using a sample of the negative labels each time.
  loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(weights=softmax_weights, biases=softmax_biases, inputs=embed,
                               labels=train_labels, num_sampled=num_sampled, num_classes=vocabulary_size))

  # Optimizer.
  # Note: The optimizer will optimize the softmax_weights AND the embeddings.
  # This is because the embeddings are defined as a variable quantity and the
  # optimizer's `minimize` method will by default modify all variable quantities 
  # that contribute to the tensor it is passed.
  # See docs on `tf.train.Optimizer.minimize()` for more details.
  optimizer = tf.train.AdamOptimizer(0.001).minimize(loss)
  
  # Compute the similarity between minibatch examples and all embeddings.
  # We use the cosine distance:
  norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
  normalized_embeddings = embeddings / norm
  valid_embeddings = tf.nn.embedding_lookup(
    normalized_embeddings, valid_dataset)
  similarity = tf.matmul(valid_embeddings, tf.transpose(normalized_embeddings))
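
The similarity part of the graph is easier to follow outside TensorFlow. A NumPy sketch of the same computation on invented 2-D embeddings (`nearest_neighbors` is a hypothetical helper; unlike the training loop, which skips the first hit, it excludes the query row explicitly):

```python
import numpy as np

def nearest_neighbors(embeddings, index, top_k):
    """Indices of the top_k rows most similar (by cosine) to row `index`, excluding itself."""
    norm = np.sqrt(np.sum(np.square(embeddings), axis=1, keepdims=True))
    normalized = embeddings / norm
    # rows have unit length, so the dot product is the cosine similarity
    similarity = normalized @ normalized.T
    order = np.argsort(-similarity[index])
    return [int(i) for i in order if i != index][:top_k]

# invented 2-D embeddings for four symbols
emb = np.array([[1.0, 0.0], [2.0, 0.1], [0.0, 1.0], [-1.0, 0.0]])
print(nearest_neighbors(emb, index=0, top_k=2))  # [1, 2]
```

Row 1 points in almost the same direction as row 0, so it comes first even though it is twice as long; normalising removes the effect of length.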

In [5]:
from pathlib import Path
outfile = Path(PATH) / "final_embeddings_3.npy"  # portable join instead of a hard-coded backslash
final_embeddings = np.load(outfile)
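
For completeness, a sketch of the save/load round trip that produces such a checkpoint file. The array and the file name `final_embeddings_demo.npy` are invented for illustration, and a temporary directory is used so nothing is left behind.

```python
import tempfile
from pathlib import Path

import numpy as np

embeddings = np.random.rand(5, 20).astype(np.float32)  # invented stand-in

with tempfile.TemporaryDirectory() as tmp:
    outfile = Path(tmp) / "final_embeddings_demo.npy"  # hypothetical file name
    np.save(outfile, embeddings)
    restored = np.load(outfile)

print(restored.shape, np.array_equal(embeddings, restored))
```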

In [17]:
num_steps = 2001
refresh_freq = 1000

with tf.Session(graph=graph) as session:
  tf.global_variables_initializer().run()
  print('Initialized')
  average_loss = 0
  for step in range(num_steps):
    batch_data, batch_labels = generate_batch(
      batch_size, num_skips, skip_window)
    feed_dict = {train_dataset : batch_data, train_labels : batch_labels}
    _, l = session.run([optimizer, loss], feed_dict=feed_dict)
    average_loss += l
    if step % refresh_freq == 0:
        if step > 0:
            average_loss = average_loss / refresh_freq
            # The average loss is an estimate of the loss over the last refresh_freq batches.
            print('Average loss at step %d: %f' % (step, average_loss))
            average_loss = 0
        sim = similarity.eval()
        for i in range(valid_size):
            valid_word = reverse_dictionary[valid_examples[i]]
            top_k = 6 # number of nearest neighbors
            nearest = (-sim[i, :]).argsort()[1:top_k+1]
            log = 'Nearest to %s:' % valid_word
            for k in range(top_k):
                close_word = reverse_dictionary[nearest[k]]
                log = '%s %s,' % (log, close_word)
            print(log)
  final_embeddings = normalized_embeddings.eval()

Initialized
Nearest to    d:    b,    c,    f, q J\,  P `, m] L,
Nearest to f   : h   , d   , i   , g   ,   R\, gLa[,
Nearest to    ]: QUVS, ]_GV,  fb[, `[ I,    _, MT] ,
Nearest to  L  :  P  ,  S  , ]ebS,    =,  Q  , ZL[O,
Nearest to    T:  NX , ` M[, JX L,   V ,   S ,  Qkk,
Nearest to  Q  :  S  ,  P  , ]VYJ,  O  , ZL[O,  XOQ,
Nearest to [   : Z   , V   , U   , Kg]H, S_D , fb l,
Nearest to  X  :  Z  ,  V  ,  U  , dd h, lhY ,  Yi[,
Nearest to   X :   Z ,   V ,   U , ni` , ^Hl , MXi],
Nearest to    P:    L,    Q,    N,    I,    O,    J,
Nearest to    a:    _,    c, JV^ , ZmN ,    d, dD \,
Nearest to  S  :  Q  ,  P  ,  O  ,  L  ,  XOQ, dJ[T,
Nearest to S   : U   , V   , s da, X   , Q   ,  _Kr,
Nearest to UNK: W_Wi, MiVb, W[_i,  j n,  `pP, gG _,
Nearest to   ] :   [ , t  _,  _Oh,   Z , B  ], ^i d,
Nearest to  E  :  I  ,  J  , a  p, aMQg, eid ,  H J,
Average loss at step 1000: 8.236681
Nearest to    d:    b,    c,    f,  P `,    a,    h,
Nearest to f   : h   , d   , W i , g   ,   R\, m Q ,

In [89]:
# Encode every bar as MAX_VOICES pairs: (scaled character code, flag).
# The flag is 0 where the character is a space and 1 otherwise.
codevector = np.zeros((len(codetext), MAX_VOICES*2))
for i in range(len(codetext)):
    bar = reverse_dictionary[codetext[i]]
    for j in range(MAX_VOICES):
        codevector[i][2*j] = (ord(bar[j])-32)/127
        codevector[i][2*j+1] = 0 if bar[j] == " " else 1

In [90]:
# Same (scaled character code, flag) encoding for the validation text.
valivector = np.zeros((len(valitext), MAX_VOICES*2))
for i in range(len(valitext)):
    bar = reverse_dictionary[valitext[i]]
    for j in range(MAX_VOICES):
        valivector[i][2*j] = (ord(bar[j])-32)/127
        valivector[i][2*j+1] = 0 if bar[j] == " " else 1
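
The two cells above run the same encoding on `codetext` and `valitext`. A minimal sketch of a shared helper, assuming each dictionary word is a string of at least `max_voices` characters; the name `encode_bars` and the toy dictionary are hypothetical and only serve the illustration:

```python
import numpy as np

def encode_bars(codes, reverse_dictionary, max_voices):
    """Encode each word as max_voices (scaled char code, is-not-space) pairs."""
    vectors = np.zeros((len(codes), max_voices * 2))
    for i, code in enumerate(codes):
        bar = reverse_dictionary[code]
        for j in range(max_voices):
            # Character code shifted so that a space maps to 0, then scaled.
            vectors[i, 2 * j] = (ord(bar[j]) - 32) / 127
            # Flag: 0 for a space, 1 for any other character.
            vectors[i, 2 * j + 1] = 0 if bar[j] == " " else 1
    return vectors

# Toy data for illustration only (not the real dictionary).
toy_reverse_dictionary = {0: "   d", 1: "f   "}
toy_vectors = encode_bars([0, 1, 0], toy_reverse_dictionary, max_voices=4)
print(toy_vectors.shape)
```

With such a helper, `codevector` and `valivector` would each be one call with the real `reverse_dictionary` and `MAX_VOICES`.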

In [263]:
codevector

array([[ 0.57480315,  1.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       [ 0.59055118,  1.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       [ 0.60629921,  1.        ,  0.        , ...,  0.        ,
         0.        ,  0.        ],
       ..., 
       [ 0.        ,  0.        ,  0.34645669, ...,  0.        ,
         0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.29133858, ...,  0.        ,
         0.        ,  0.        ],
       [ 0.07874016,  1.        ,  0.07874016, ...,  1.        ,
         0.07874016,  1.        ]])

In [44]:
for i in range(16):
    valid_word = reverse_dictionary[valid_examples[i]]
    top_k = 12 # number of nearest neighbors
    nearest = (-sim[i, :]).argsort()[1:top_k+1]
    log = 'Nearest to %s:' % valid_word
    for k in range(top_k):
          close_word = reverse_dictionary[nearest[k]]
          log = '%s %s,' % (log, close_word)
    print(log)

Nearest to    d:    b,    c,    f,  P `,    a,    h, Q Xb, HT V, m] L, aUgQ,  [QE,    i,
Nearest to f   : h   , d   , W i , g   ,   R\, m Q , b   , n^T , i   , [Y  , iU  , gLa[,
Nearest to    ]: QUVS,    _,  fb[, ]_GV, `[ I, a  o, []LB, MT] , F[b^, iQZa, QVMS, X Ob,
Nearest to  L  :  I  ,     ,  G  ,  N  ,  K  ,  P  ,   M ,   K ,    @,  S  ,  B  ,  J  ,
Nearest to    T:   X ,   V ,   J ,   U ,   O ,   S ,   Q ,   G ,   L ,   P ,   I ,   Z ,
Nearest to  Q  :  S  ,  P  ,  J  ,  E  ,  U  ,  I  ,  N  ,  L  ,  D  ,  V  ,  O  ,  X  ,
Nearest to [   : Z   , Kg]H, S_D , fb l, pLX ,  NN_, c\l , mW d, sZ p, X   , oi T, h XX,
Nearest to  X  :  E  ,  U  ,  V  ,  J  ,  I  ,  S  ,  Q  ,  Z  ,  P  ,   B ,  G  ,  L  ,
Nearest to   X :   V ,   U ,   S ,   Q ,   L ,   N ,   G ,   P ,   I ,   J , ********,   Z ,
Nearest to    P:    N,    Q,    L,    S,    U,    X,    J,    V,    R,    I,    O,    G,
Nearest to    a:    _,    c, JV^ , ZmN ,    d, E_  , dD \, c GS, fm X,  kXL, Z dK,  dYO,
Nearest to  S  : 

In [26]:
final_embeddings[0]

array([ 0.40275112, -0.04863255,  0.30368793,  0.06228257, -0.28215745,
       -0.04976769,  0.18830134, -0.0575086 ,  0.16519442,  0.07572614,
       -0.35413435, -0.136016  ,  0.05114105,  0.23464742, -0.01558306,
        0.21568705,  0.38817203, -0.12697509, -0.40035945, -0.1000249 ], dtype=float32)
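
The rows of `final_embeddings` come from `normalized_embeddings`, so they already have unit length, and nearest neighbors can also be looked up outside the TensorFlow session. A minimal NumPy sketch; `nearest_neighbors` is a hypothetical helper, and the random matrix only stands in for the trained embeddings:

```python
import numpy as np

def nearest_neighbors(embeddings, index, top_k=6):
    """Return indices of the top_k rows most similar to row `index` (cosine)."""
    # Re-normalize defensively so the dot product is a cosine similarity.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = np.dot(unit, unit[index])
    order = np.argsort(-sim)
    # Skip the query row itself, then keep the top_k best matches.
    return [int(j) for j in order if j != index][:top_k]

# Stand-in data: random vectors instead of the trained embeddings.
rng = np.random.RandomState(0)
demo_embeddings = rng.randn(50, 20).astype(np.float32)
demo_neighbors = nearest_neighbors(demo_embeddings, index=0, top_k=6)
print(demo_neighbors)
```

On the real data, each returned index would be mapped back to its word through `reverse_dictionary`, as in the cells above.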

In [11]:
from pathlib import Path
outfile = Path(PATH) / "final_embeddings_4.npy"
np.save(outfile, final_embeddings)

In [10]:
code_vector = np.df()

'C:\\Users\\krzysztof.pieranski\\Documents\\ml\\deepbach'

Now let's substitute integer codes for notes with corresponding embedding vectors.
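
In NumPy this substitution is a single indexing step: indexing the embedding matrix with an array of integer codes returns one embedding row per code. A minimal sketch, with toy stand-ins for `final_embeddings` and `codetext` (the toy values are assumptions for illustration):

```python
import numpy as np

# Toy stand-ins: 5 vocabulary entries with 3-dimensional embeddings.
toy_embeddings = np.arange(15, dtype=np.float32).reshape(5, 3)
toy_codes = np.array([[0, 2, 4],
                      [1, 1, 3]])          # shape [batch, sequence]

# Each integer code is replaced by its embedding row.
toy_embedded = toy_embeddings[toy_codes]   # shape [batch, sequence, embedding]
print(toy_embedded.shape)
```

`tf.nn.embedding_lookup` performs the same row gather inside the graph.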

In [None]:
import tensorflow as tf

batch_size = 128
embedding_size = 20 # Dimension of the embedding vector.
n_neurons = 256
n_layers = 

graph = tf.Graph()

with graph.as_default(), tf.device('/cpu:0'):

  # Input data.
  train_dataset = tf.placeholder(tf.int32, [batch_size, SEQLEN, 1])
  train_labels = tf.placeholder(tf.int32, shape=[batch_size, SEQLEN, 1])
  valid_dataset = tf.constant(valid_examples, dtype=tf.int32)
  

  # Constant. We use pre-trained embeddings
  embeddings = tf.constant(final_embeddings)  # fixed, pre-trained embeddings (not updated during training)

  # Model.
  
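  # Look up embeddings for inputs. Drop the trailing axis first so the result is
  # [batch_size, SEQLEN, embedding_size], the rank-3 input that dynamic_rnn expects.
  input_embed = tf.nn.embedding_lookup(embeddings, tf.squeeze(train_dataset, axis=2))
  
  layers = [tf.contrib.rnn.LSTMCell(n_neurons, activation=tf.nn.relu) for layer in range(n_layers)]
  multi_layer_cell = tf.contrib.rnn.MultiRNNCell(layers)
  outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, input_embed, dtype=tf.float32)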


    

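  # Output projection for the sampled softmax, sized to the LSTM output.
  softmax_weights = tf.Variable(
    tf.truncated_normal([vocabulary_size, n_neurons], stddev=1.0 / n_neurons ** 0.5))
  softmax_biases = tf.Variable(tf.zeros([vocabulary_size]))

  # sampled_softmax_loss expects 2-D inputs and labels, so fold time into the batch axis.
  flat_outputs = tf.reshape(outputs, [-1, n_neurons])
  flat_labels = tf.reshape(train_labels, [-1, 1])
  loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(weights=softmax_weights, biases=softmax_biases, inputs=flat_outputs,
                               labels=flat_labels, num_sampled=num_sampled, num_classes=vocabulary_size))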

  # Optimizer.
  # Note: the embeddings are a constant here, so the optimizer only updates
  # the LSTM and softmax variables. By default, `minimize` modifies all
  # trainable variables that contribute to the tensor it is passed.
  # See docs on `tf.train.Optimizer.minimize()` for more details.
  optimizer = tf.train.AdamOptimizer(0.001).minimize(loss)
  
  # Compute the similarity between minibatch examples and all embeddings.
  # We use the cosine similarity (dot product of L2-normalized embeddings):
  norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
  normalized_embeddings = embeddings / norm
  valid_embeddings = tf.nn.embedding_lookup(
    normalized_embeddings, valid_dataset)
  similarity = tf.matmul(valid_embeddings, tf.transpose(normalized_embeddings))

In [161]:
for i in range(1,len(b)):
    print(permutate("abcd", b[i]))
    

abdc
acbd
acdb
adbc
adcb
bacd
badc
bcad
bcda
bdac
bdca
cabd
cadb
cbad
cbda
cdab
cdba
dabc
dacb
dbac
dbca
dcab
dcba


In [156]:
b

[(0, 1, 2, 3),
 (0, 1, 3, 2),
 (0, 2, 1, 3),
 (0, 2, 3, 1),
 (0, 3, 1, 2),
 (0, 3, 2, 1),
 (1, 0, 2, 3),
 (1, 0, 3, 2),
 (1, 2, 0, 3),
 (1, 2, 3, 0),
 (1, 3, 0, 2),
 (1, 3, 2, 0),
 (2, 0, 1, 3),
 (2, 0, 3, 1),
 (2, 1, 0, 3),
 (2, 1, 3, 0),
 (2, 3, 0, 1),
 (2, 3, 1, 0),
 (3, 0, 1, 2),
 (3, 0, 2, 1),
 (3, 1, 0, 2),
 (3, 1, 2, 0),
 (3, 2, 0, 1),
 (3, 2, 1, 0)]
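
The list `b` above matches what `itertools.permutations(range(4))` produces, and the printed strings are consistent with picking characters in the order each tuple gives. A sketch of how the two could be defined, assuming that this is what `permutate` does (its actual definition is not shown here):

```python
from itertools import permutations

def permutate(word, order):
    """Rearrange `word` so that position i holds word[order[i]]."""
    return "".join(word[k] for k in order)

# All orderings of four positions, in lexicographic order.
b = list(permutations(range(4)))
print(len(b))
print(permutate("abcd", b[1]))
```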

In [36]:
words.pop()

'********'

In [220]:
from math import log2

In [221]:
log2(len(dictionary))

17.48772229515735

In [112]:
words

['aabb    ', '    hhii', '  kkll  ']

In [60]:
batch_size = 32
sequence_size = 64
data = np.array(codetext)
data_len = data.shape[0]
# using (data_len-1) because we must provide for the sequence shifted by 1 too
nb_batches = (data_len - 1) // (batch_size * sequence_size)
assert nb_batches > 0, "Not enough data, even for a single batch. Try using a smaller batch_size."
rounded_data_len = nb_batches * batch_size * sequence_size
xdata = np.reshape(data[0:rounded_data_len], [batch_size, nb_batches * sequence_size])
ydata = np.reshape(data[1:rounded_data_len + 1], [batch_size, nb_batches * sequence_size])

In [61]:
xdata

array([[     1,      1,      1, ...,    776,     96,    153],
       [    32,     92,    595, ...,     50,      6,      1],
       [     1,      1,      1, ...,    154,    154,     97],
       ..., 
       [     1,      1,      4, ...,      9,     36,     13],
       [     9,    145,    145, ...,    806,     76,     76],
       [     4,      4,     24, ..., 112439,   1087, 154318]])

In [62]:
xdata.shape

(32, 184768)

In [63]:
32*184768

5912576
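
The reshape above lays the text out as `batch_size` parallel streams, each `nb_batches * sequence_size` codes long, with `ydata` holding the same streams shifted by one position. A minimal sketch on toy data of how such arrays could then be cut into consecutive minibatches; the `iterate_minibatches` generator is a hypothetical helper, not part of the notebook:

```python
import numpy as np

def iterate_minibatches(xdata, ydata, sequence_size):
    """Yield (x, y) windows of width sequence_size, left to right."""
    nb_batches = xdata.shape[1] // sequence_size
    for k in range(nb_batches):
        window = slice(k * sequence_size, (k + 1) * sequence_size)
        yield xdata[:, window], ydata[:, window]

# Toy data: 25 codes, 2 streams, sequences of 3 steps.
toy_batch_size, toy_sequence_size = 2, 3
toy_data = np.arange(25)
toy_nb_batches = (toy_data.shape[0] - 1) // (toy_batch_size * toy_sequence_size)
toy_len = toy_nb_batches * toy_batch_size * toy_sequence_size
toy_x = np.reshape(toy_data[0:toy_len], [toy_batch_size, -1])
toy_y = np.reshape(toy_data[1:toy_len + 1], [toy_batch_size, -1])

toy_batches = list(iterate_minibatches(toy_x, toy_y, toy_sequence_size))
print(len(toy_batches), toy_batches[0][0].shape)
```

Each row of one minibatch continues where the same row of the previous minibatch stopped, so with this layout the RNN state could be carried from one batch to the next.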