<a href="https://colab.research.google.com/github/kumar-abhishek/handson-ml2/blob/master/Final_Chris_BachChorales_HandsOnCh15_2020_01_03.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
%tensorflow_version 2.x

In [0]:
import os

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras

In [0]:
# Python ≥3.5 is required
import sys
assert sys.version_info >= (3, 5)

# Scikit-Learn ≥0.20 is required
import sklearn
assert sklearn.__version__ >= "0.20"

try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
    IS_COLAB = True
except Exception:
    IS_COLAB = False

# TensorFlow ≥2.0 is required
import tensorflow as tf
from tensorflow import keras
assert tf.__version__ >= "2.0"

if not tf.test.is_gpu_available():
    print("No GPU was detected. LSTMs and CNNs can be very slow without a GPU.")
    if IS_COLAB:
        print("Go to Runtime > Change runtime and select a GPU hardware accelerator.")

# Common imports
import numpy as np
import os

# to make this notebook's output stable across runs
np.random.seed(42)
tf.random.set_seed(42)

> Ch15 Q10 
>
> Download the Bach chorales dataset and unzip it. It is composed of 382 chorales composed by Johann Sebastian Bach. Each chorale is 100 to 640 time steps long, and each time step contains 4 integers, where each integer corresponds to a note’s index on a piano (except for the value 0, which means that no note is played). Train a model--recurrent, convolutional, or both--that can predict the next time step (four notes), given a sequence of time steps from a chorale.

The Bach chorales are available from:

https://github.com/ageron/handson-ml2/blob/master/datasets/jsb_chorales/jsb_chorales.tgz

I downloaded them to my Google Drive. Then I mounted my Google Drive to Google Colab:

`Expand the left pane > Files > Mount Drive`

In [89]:
from google.colab import drive
import os
import numpy as np
import pandas as pd
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [0]:
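def read_dir_of_chorales(directory):
  """Read every CSV file in `directory` into a list of NumPy arrays, one per chorale."""

  print("\nReading chorales in: " + directory + "\n---------------\n")

  chorales = []
  file_counter = 0
  # sort the filenames so the chorales are always loaded in the same order
  for filename in sorted(os.listdir(directory)):
    file_counter += 1
    # report progress every 10 files
    if (file_counter % 10) == 0:
      print(str(file_counter) + ") filename: " + filename)

    one_training_chorale = pd.read_csv(os.path.join(directory, filename)).to_numpy()
    chorales.append(one_training_chorale)

  return chorales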

In [85]:
data_path = "/content/drive/My Drive/jsb_chorales"

train_path = os.path.join(data_path, 'train')
valid_path = os.path.join(data_path, 'valid')
test_path = os.path.join(data_path, 'test')

bach_training_chorales = read_dir_of_chorales(train_path)
bach_validation_chorales = read_dir_of_chorales(valid_path)
bach_test_chorales = read_dir_of_chorales(test_path)


Reading chorales in: /content/drive/My Drive/jsb_chorales/train
---------------

10) filename: chorale_009.csv
20) filename: chorale_019.csv
30) filename: chorale_029.csv
40) filename: chorale_039.csv
50) filename: chorale_049.csv
60) filename: chorale_059.csv
70) filename: chorale_069.csv
80) filename: chorale_079.csv
90) filename: chorale_089.csv
100) filename: chorale_099.csv
110) filename: chorale_109.csv
120) filename: chorale_119.csv
130) filename: chorale_129.csv
140) filename: chorale_139.csv
150) filename: chorale_149.csv
160) filename: chorale_159.csv
170) filename: chorale_169.csv
180) filename: chorale_179.csv
190) filename: chorale_189.csv
200) filename: chorale_199.csv
210) filename: chorale_209.csv
220) filename: chorale_219.csv

Reading chorales in: /content/drive/My Drive/jsb_chorales/valid
---------------

10) filename: chorale_238.csv
20) filename: chorale_248.csv
30) filename: chorale_258.csv
40) filename: chorale_268.csv
50) filename: chorale_278.csv
60) filename:

Sanity check: print the shape and the first six time steps of the first three training chorales.

In [91]:
chorale_counter = 0
for chorale in bach_training_chorales:
  chorale_counter += 1
  if chorale_counter > 3: break

  print("Chorale #" + str(chorale_counter))
  print("Shape: " + str(chorale.shape))
  print(chorale[:6])
  print(" ...\n")


Chorale #1
Shape: (192, 4)
[[74 70 65 58]
 [74 70 65 58]
 [74 70 65 58]
 [74 70 65 58]
 [75 70 58 55]
 [75 70 58 55]]
 ...

Chorale #2
Shape: (228, 4)
[[69 64 61 57]
 [69 64 61 57]
 [69 64 61 57]
 [69 64 61 57]
 [71 64 59 56]
 [71 64 59 56]]
 ...

Chorale #3
Shape: (208, 4)
[[67 62 59 55]
 [67 62 59 55]
 [67 62 59 55]
 [67 62 59 55]
 [67 64 60 48]
 [67 64 60 48]]
 ...
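Beyond printing a few rows, it can help to look at the lengths and the range of note values, since these decide how much padding is needed and how many classes a classifier would have. The following is only a sketch: `summarize_chorales` and `toy_chorales` are hypothetical names, and the toy values stand in for `bach_training_chorales`.

```python
import numpy as np

def summarize_chorales(chorales):
    """Return length and note-range statistics for a list of (time_steps, 4) arrays."""
    lengths = [len(chorale) for chorale in chorales]
    # Pool every note of every chorale into one flat array
    all_notes = np.concatenate([np.asarray(c).ravel() for c in chorales])
    played = all_notes[all_notes != 0]  # 0 means "no note played"
    return {
        "n_chorales": len(chorales),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "min_note": int(played.min()),
        "max_note": int(played.max()),
        "n_distinct_values": int(len(np.unique(all_notes))),
    }

# Toy stand-in for bach_training_chorales (values are illustrative only)
toy_chorales = [
    np.array([[74, 70, 65, 58], [75, 70, 58, 55]]),
    np.array([[69, 64, 61, 57], [71, 64, 59, 56], [71, 0, 59, 56]]),
]
stats = summarize_chorales(toy_chorales)
print(stats)
```

Calling `summarize_chorales(bach_training_chorales)` on the real data would give the figures needed to choose `maxlen` for padding and the number of note classes.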



> Ch 15 Q 10
>
> Train a model--recurrent, convolutional, or both--that can predict the next time step (four notes), given a sequence of time steps from a chorale.

### Approaches

Three approaches seem reasonable for this problem:

1) Treat it as a regression problem and predict each next note value on the real line. Mean squared error (MSE) is the natural loss here. (Minimizing MSE is equivalent to minimizing the cross-entropy, i.e. maximizing the likelihood, under the assumption that each output value carries Gaussian noise of fixed variance.)

2) In "music space", however, neighboring notes can be further apart than notes from the same chord or key. It may therefore make more sense to treat each note as a separate class and measure performance with the cross-entropy of the multinoulli (categorical) distribution.

3) Design a custom loss that encodes our knowledge of music theory, for example one that directly penalizes deviations from the chord or key.

A custom loss would take a lot of time to implement, and the model may well learn most of the music theory simply by observing the note combinations in the chorales. So let's rule out option 3.

Since the notes are represented as numbers, regression looks like the obvious choice, but with my knowledge of music I am inclined toward option 2. I will start with option 2 and revisit option 1 if I have time.
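The link between MSE and a Gaussian likelihood mentioned under option 1 can be checked numerically. This is a small sketch, not part of the original notebook: `gaussian_nll` and `mse` are hypothetical helper names, the noise standard deviation is assumed fixed at 1, and the predicted values are made up.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error over all outputs."""
    return np.mean((y_true - y_pred) ** 2)

def gaussian_nll(y_true, y_pred, sigma=1.0):
    """Mean negative log-likelihood of y_true under N(y_pred, sigma^2)."""
    log_norm = 0.5 * np.log(2 * np.pi * sigma ** 2)
    return np.mean(log_norm + (y_true - y_pred) ** 2 / (2 * sigma ** 2))

y_true = np.array([74.0, 70.0, 65.0, 58.0])   # one time step, four notes
y_pred = np.array([73.2, 70.5, 66.1, 57.0])   # illustrative predictions

# With sigma fixed, the NLL is the MSE scaled by 1/(2 sigma^2) plus a constant,
# so both losses are minimized by the same predictions.
constant = 0.5 * np.log(2 * np.pi)
print(gaussian_nll(y_true, y_pred), 0.5 * mse(y_true, y_pred) + constant)
```

Because the two quantities differ only by a positive scale and an additive constant, training with MSE amounts to maximum likelihood under that noise assumption.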

### The multiple outputs

Another wrinkle in this task is that there are 4 outputs (4 notes) at each time step. With enough training data, we might be able to treat each separate combination of 4 notes as a separate class, but for this exercise my loss will just be the sum of the losses across the 4 notes. 

(Another idea would be to train a GAN-style loss function that rates each 4-note combination on its likelihood. We could create training data from random combinations of notes and train a classifier to distinguish them from real note combinations. But again, that is too complicated for this first stab.)
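The "sum of the losses across the 4 notes" idea can be sketched in plain NumPy. The note range used to build class indices (36 to 81, plus 0 for silence) is an assumption here and should be checked against the data; `notes_to_classes` and `summed_cross_entropy` are hypothetical names.

```python
import numpy as np

MIN_NOTE, MAX_NOTE = 36, 81             # assumed note range; verify on the dataset
N_CLASSES = MAX_NOTE - MIN_NOTE + 2     # one class per note plus class 0 for silence

def notes_to_classes(notes):
    """Map note values to class indices: 0 stays 0, MIN_NOTE becomes 1, and so on."""
    notes = np.asarray(notes)
    return np.where(notes == 0, 0, notes - MIN_NOTE + 1)

def summed_cross_entropy(logits, class_ids):
    """Cross-entropy per voice, summed over the voices of one time step.

    logits: array of shape (n_voices, N_CLASSES); class_ids: shape (n_voices,).
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)   # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    per_voice = -log_probs[np.arange(len(class_ids)), class_ids]
    return per_voice.sum()

targets = notes_to_classes([74, 70, 0, 58])
uniform_logits = np.zeros((4, N_CLASSES))   # a model that knows nothing
print(targets, summed_cross_entropy(uniform_logits, targets))
```

A uniform prediction gives a loss of `4 * log(N_CLASSES)`, which is a handy baseline when reading training curves.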

First, let's gather some baseline metrics.

### Multinoulli MLE Loss

So, our first model will be a classifier with average MLE (aka cross-entropy) loss across the 4 notes.

But FIRST, I need to build a model with **1** input and **1** output for the sake of sanity.

In [92]:
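def split_off_bass(chorales):
  """Split each chorale into model inputs, shifted targets and a few end-of-piece slices."""

  bach_chorales_minus_last_notes = []  # inputs: every time step except the last
  last_notes = []                      # final time step of each chorale
  next_to_last_notes = []              # second-to-last time step of each chorale
  bach_chorales_bass_last_notes = []   # targets: every time step except the first
  bass_next_to_last = []               # first column of the second-to-last time step
  for chorale in chorales:
    bach_chorales_minus_last_notes.append(chorale[:-1])
    if len(chorale)>0:
      last_notes.append(chorale[-1])
    # a second-to-last step only exists if the chorale has at least two steps
    if len(chorale)>1:
      next_to_last_notes.append(chorale[-2])
    if len(chorale)>0:
      bach_chorales_bass_last_notes.append(chorale[1:])
    if len(chorale)>1:
      bass_next_to_last.append([chorale[-2, 0]])

  return np.array(bach_chorales_minus_last_notes), np.array(last_notes), np.array(next_to_last_notes), np.array(bach_chorales_bass_last_notes), np.array(bass_next_to_last)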

training_minus_last_notes, training_last_notes, training_next_to_last_notes, bach_training_chorales_bass_last_notes, training_bass_next_to_last = split_off_bass(bach_training_chorales)
validation_minus_last_notes, validation_last_notes, validation_next_to_last_notes, bach_validation_chorales_bass_last_notes, validation_bass_next_to_last = split_off_bass(bach_validation_chorales)
test_minus_last_notes, test_last_notes, test_next_to_last_notes, bach_test_chorales_bass_last_notes, test_bass_next_to_last = split_off_bass(bach_test_chorales)

print("len(bach_training_chorales[0]): ", len(bach_training_chorales[0]))
print("len(training_minus_last_notes[0]): ", len(training_minus_last_notes[0]))
print("bach_training_chorales[0][-1]: ", bach_training_chorales[0][-1])
print("training_minus_last_notes[0][-1]: ", training_minus_last_notes[0][-1])

print("training_last_notes.shape: ", training_last_notes.shape)
print("training_last_notes[0]: ", training_last_notes[0])
print("bach_training_chorales_bass_last_notes.shape: ", bach_training_chorales_bass_last_notes.shape)
print("bach_training_chorales_bass_last_notes[0]: ", bach_training_chorales_bass_last_notes[0][0:10])


len(bach_training_chorales[0]):  192
len(training_minus_last_notes[0]):  191
bach_training_chorales[0][-1]:  [70 65 62 46]
training_minus_last_notes[0][-1]:  [70 65 62 46]
training_last_notes.shape:  (229, 4)
training_last_notes[0]:  [70 65 62 46]
bach_training_chorales_bass_last_notes.shape:  (229,)
bach_training_chorales_bass_last_notes[0]:  [[74 70 65 58]
 [74 70 65 58]
 [74 70 65 58]
 [75 70 58 55]
 [75 70 58 55]
 [75 70 60 55]
 [75 70 60 55]
 [77 69 62 50]
 [77 69 62 50]
 [77 69 62 50]]
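The pair that matters for training is the input sequence `chorale[:-1]` and the target sequence `chorale[1:]`: at every position the target is simply the next time step. A minimal sketch on a toy array (`make_input_target` and `toy` are hypothetical names, not part of the notebook):

```python
import numpy as np

def make_input_target(chorale):
    """Return (inputs, targets) where targets[t] is the time step after inputs[t]."""
    chorale = np.asarray(chorale)
    return chorale[:-1], chorale[1:]

# Four time steps of four notes each (illustrative values)
toy = np.array([
    [74, 70, 65, 58],
    [74, 70, 65, 58],
    [75, 70, 58, 55],
    [75, 70, 60, 55],
])
inputs, targets = make_input_target(toy)

for t in range(len(inputs)):
    print("input:", inputs[t], " target:", targets[t])
```

Both arrays are one step shorter than the chorale, which matches the lengths 192 and 191 printed above.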


In [0]:
training_minus_last_notes_padded = tf.keras.preprocessing.sequence.pad_sequences(training_minus_last_notes, padding='post', maxlen=576)
validation_minus_last_notes_padded = tf.keras.preprocessing.sequence.pad_sequences(validation_minus_last_notes, padding='post', maxlen=576)
bach_training_chorales_bass_last_notes_padded = tf.keras.preprocessing.sequence.pad_sequences(bach_training_chorales_bass_last_notes, padding='post', maxlen=576)
bach_validation_chorales_bass_last_notes_padded = tf.keras.preprocessing.sequence.pad_sequences(bach_validation_chorales_bass_last_notes, padding='post', maxlen=576)

In [94]:
bach_training_chorales_bass_last_notes_padded.shape, training_minus_last_notes_padded.shape

((229, 576, 4), (229, 576, 4))

In [95]:
bach_validation_chorales_bass_last_notes.shape, validation_minus_last_notes_padded.shape

((76,), (76, 576, 4))

In [96]:
print(training_minus_last_notes_padded[0:2])

[[[74 70 65 58]
  [74 70 65 58]
  [74 70 65 58]
  ...
  [ 0  0  0  0]
  [ 0  0  0  0]
  [ 0  0  0  0]]

 [[69 64 61 57]
  [69 64 61 57]
  [69 64 61 57]
  ...
  [ 0  0  0  0]
  [ 0  0  0  0]
  [ 0  0  0  0]]]
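Post-padding fills the tail of every shorter chorale with rows of zeros, so a loss averaged over all 576 steps also scores the model on padding. The sketch below mimics post-padding in NumPy and shows one possible way to exclude padded steps from an MSE. This notebook does not do such masking; `pad_post` and `masked_mse` are hypothetical helpers written for illustration.

```python
import numpy as np

def pad_post(sequences, maxlen):
    """Zero-pad (time, features) arrays at the end, up to maxlen time steps."""
    n_features = sequences[0].shape[1]
    padded = np.zeros((len(sequences), maxlen, n_features), dtype=sequences[0].dtype)
    for row, seq in enumerate(sequences):
        seq = seq[-maxlen:]           # keep the last maxlen steps if too long
        padded[row, :len(seq)] = seq
    return padded

def masked_mse(y_true, y_pred, lengths):
    """MSE averaged over real (unpadded) time steps only."""
    steps = np.arange(y_true.shape[1])
    mask = steps[None, :] < np.asarray(lengths)[:, None]      # (batch, time)
    squared_error = ((y_true - y_pred) ** 2).mean(axis=-1)    # (batch, time)
    return (squared_error * mask).sum() / mask.sum()

toy = [np.full((2, 4), 60), np.full((3, 4), 70)]   # two short "chorales"
lengths = [len(seq) for seq in toy]
y_true = pad_post(toy, maxlen=5)

y_pred = y_true + 1.0      # off by exactly one on every real step
y_pred[0, 2:] = 100.0      # arbitrary values on padded steps
y_pred[1, 3:] = 100.0
print(y_true.shape, masked_mse(y_true, y_pred, lengths))
```

With the mask applied, the large errors on the padded rows do not affect the result.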


In [97]:
# `keras` below is tf.keras, imported at the top of the notebook

n_features = 4

model = keras.models.Sequential()

model.add(keras.layers.TimeDistributed(keras.layers.Dense(128), input_shape=(None, n_features)) ) # This line makes a lot of difference but why?
model.add(keras.layers.LSTM(64, input_shape=(None, n_features), return_sequences=True))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dropout(0.2))

model.add(keras.layers.LSTM(32, return_sequences=True))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Dense(4))
model.compile(optimizer='adam', loss='mse')

# fit model
model.fit(training_minus_last_notes_padded, bach_training_chorales_bass_last_notes_padded, epochs=500, validation_data=(validation_minus_last_notes_padded, bach_validation_chorales_bass_last_notes_padded))

Train on 229 samples, validate on 76 samples
Epoch 1/500
Epoch 2/500
...

<tensorflow.python.keras.callbacks.History at 0x7f098d802940>
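For reference, `TimeDistributed(Dense(128))` applies one and the same dense layer to every time step independently. With no activation, that is a linear projection of each 4-note vector to 128 features before the first LSTM. The NumPy sketch below illustrates only that mechanic; `kernel` and `bias` are random stand-ins, not the trained weights, and it does not by itself explain why the layer helps.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_units = 4, 128

# Hypothetical weights standing in for the Dense layer's kernel and bias
kernel = rng.normal(size=(n_features, n_units))
bias = np.zeros(n_units)

def time_distributed_dense(x, kernel, bias):
    """Apply the same linear map to every time step of x.

    x has shape (batch, time, n_features); the result has shape (batch, time, n_units).
    """
    return x @ kernel + bias

x = rng.normal(size=(2, 5, n_features))     # 2 sequences, 5 time steps each
projected = time_distributed_dense(x, kernel, bias)

# The same computation written as an explicit loop over time steps
looped = np.stack([x[:, t] @ kernel + bias for t in range(x.shape[1])], axis=1)
print(projected.shape, np.allclose(projected, looped))
```

Because the weights are shared across time steps, the layer works for sequences of any length.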

In [99]:
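# demonstrate prediction: feed time steps 5..39 of the first training chorale
x_input = training_minus_last_notes_padded[0][5:40]
x_input = x_input.reshape((1, len(x_input), 4))
print(x_input)
yhat = model.predict(x_input, verbose=0)
# the last output is the model's prediction for time step 40
print(yhat[0][-1])
# NOTE: index 41 is one step later than the predicted step (index 40)
print('expected: ', training_minus_last_notes_padded[0][41])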

[[[75 70 58 55]
  [75 70 60 55]
  [75 70 60 55]
  [77 69 62 50]
  [77 69 62 50]
  [77 69 62 50]
  [77 69 62 50]
  [77 70 62 55]
  [77 70 62 55]
  [77 69 62 55]
  [77 69 62 55]
  [75 67 63 48]
  [75 67 63 48]
  [75 69 63 48]
  [75 69 63 48]
  [74 70 65 46]
  [74 70 65 46]
  [74 70 65 46]
  [74 70 65 46]
  [72 69 65 53]
  [72 69 65 53]
  [72 69 65 53]
  [72 69 65 53]
  [72 69 65 53]
  [72 69 65 53]
  [72 69 65 53]
  [72 69 65 53]
  [74 70 65 46]
  [74 70 65 46]
  [74 70 65 46]
  [74 70 65 46]
  [75 69 63 48]
  [75 69 63 48]
  [75 67 63 48]
  [75 67 63 48]]]
[70.16625  65.31986  60.064274 46.48195 ]
expected:  [77 65 62 50]


> Ch 15 Q 10 (part 2)
>
> Then use this model to generate Bach-like music, one note at a time: you can do this by giving the model the start of a chorale and asking it to predict the next time step, then appending these time steps to the input sequence and asking the model for the next note, and so on. Also make sure to check out Google’s Coconet model, which was used for a nice Google doodle about Bach.
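The generation procedure described in the exercise is an autoregressive loop: predict, append, predict again. Here is a model-free sketch of that loop. `generate`, `stochastic_round` and `repeat_last` are hypothetical names, and `repeat_last` is a trivial stand-in for the trained model that always repeats the previous time step.

```python
import numpy as np

def stochastic_round(values, rng):
    """Round each value up with probability equal to its fractional part."""
    values = np.asarray(values, dtype=float)
    return np.floor(values + rng.random(values.shape))

def generate(seed, predict_next, n_new_steps, rng):
    """Extend `seed` (shape (n_seed, 4)) by n_new_steps predicted time steps.

    predict_next maps the sequence so far, shape (t, 4), to a real-valued
    prediction of shape (4,) for the following time step.
    """
    sequence = np.asarray(seed, dtype=float)
    for _ in range(n_new_steps):
        next_step = stochastic_round(predict_next(sequence), rng)
        sequence = np.vstack([sequence, next_step])   # feed the prediction back in
    return sequence

def repeat_last(sequence):
    """Stand-in predictor: always predicts the most recent time step again."""
    return sequence[-1]

rng = np.random.default_rng(42)
generated = generate([[71, 66, 62, 47]], repeat_last, n_new_steps=3, rng=rng)
print(generated)
```

With a Keras model, `predict_next` could wrap `model.predict` on the sequence reshaped to `(1, t, 4)` and return the last time step of the output.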



In [117]:
import random

# Seed the generation with the first time step of one training chorale (index 5)
i = 1
x_input = np.array([training_minus_last_notes_padded[5][0]])
x_input = x_input.reshape((1, len(x_input), n_features))

while i<len(training_minus_last_notes_padded[5]):
  print('Input: ', x_input)
  print('---------------')
  yhat = model.predict(x_input, verbose=1)
  # keep only the prediction for the step that follows the current input
  output = np.array([yhat[0][-1]])
  print('out:',output)
  # stochastic rounding: round up with probability equal to the fractional part
  for j in range(len(output[0])):
    output[0][j] = int(output[0][j] + random.random())
  print('Predicted Output: ', output)
  # NOTE: row i of the shifted targets is one step later than the step just predicted
  print('expected: ', bach_training_chorales_bass_last_notes_padded[5][i])
  print('\n\n')

  # append the rounded prediction to the input and predict again
  output = output.reshape((1, len(output), n_features))
  x_input = np.concatenate([x_input, output], axis=1)
  i += 1
  if i>20:
    break

Input:  [[[71 66 62 47]]]
---------------
out: [[67.91701  62.819077 59.154694 46.598526]]
Predicted Output:  [[68. 63. 59. 46.]]
expected:  [71 66 62 47]



Input:  [[[71. 66. 62. 47.]
  [68. 63. 59. 46.]]]
---------------
out: [[67.15555  62.926598 58.29028  45.984703]]
Predicted Output:  [[67. 63. 58. 46.]]
expected:  [71 66 62 47]



Input:  [[[71. 66. 62. 47.]
  [68. 63. 59. 46.]
  [67. 63. 58. 46.]]]
---------------
out: [[66.06477  62.32431  57.541103 45.240593]]
Predicted Output:  [[66. 63. 58. 45.]]
expected:  [71 67 64 52]



Input:  [[[71. 66. 62. 47.]
  [68. 63. 59. 46.]
  [67. 63. 58. 46.]
  [66. 63. 58. 45.]]]
---------------
out: [[63.933865 60.612312 56.071384 43.514526]]
Predicted Output:  [[64. 60. 56. 44.]]
expected:  [71 67 64 52]



Input:  [[[71. 66. 62. 47.]
  [68. 63. 59. 46.]
  [67. 63. 58. 46.]
  [66. 63. 58. 45.]
  [64. 60. 56. 44.]]]
---------------
out: [[61.947475 58.686172 54.17954  42.34578 ]]
Predicted Output:  [[62. 59. 55. 43.]]
expected:  [71 67 64 5