# <center>Generating Music Using LSTM Cells</center>

This notebook implements modified code from [this](https://github.com/corynguyen19/midi-lstm-gan) GitHub repo.

The idea is to read in MIDI files and convert them to arrays of notes. An LSTM-based RNN is then trained to predict the next note from the notes that precede it. Finally, music is generated by feeding a seed sequence of notes to the RNN and having it iteratively predict the next note, forming a song one note at a time.

### Things to test tomorrow:

* Add batch normalization - Loss won't go below ~3
* Add an extra layer (RNN and dense) - Meh
* Normalize between -1 to 1 - Didn't do much
* Decrease step size - MUCH better results with step size of 1
* Mess with batch size - Smaller seems better
* Mash-up 2 songs? - Works surprisingly well
* Converges? - No, different inputs should produce different tracks using the same model
* Try transfer learning and training the last few layers on this data?


### Visualizations
Cross correlations

In [1]:
# Imports
import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import time
import os
from music21 import converter, instrument, note, chord, stream, duration
from keras.models import Sequential, load_model
from keras.callbacks import Callback, History, ModelCheckpoint
from keras.layers import Activation, Bidirectional, CuDNNLSTM, Dense, Dropout, LSTM
from keras.utils import np_utils

Using TensorFlow backend.


## Loading and Cleaning the Data

First, I will load all the notes and chords from the MIDI files.

In [2]:
def get_notes(path):
    """
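        Gets all notes and chords from the MIDI files in path

        Returns a list of songs, where each song is a list of
        [pitch_or_chord, offset, duration] entries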
    """
    notes = []

    for file in glob.glob(path + "*.mid"):        
        song = []
        midi = converter.parse(file)
        
        print("Parsing %s" % file)
        
        notes_to_parse = None

        try: # file has instrument parts
            s2 = instrument.partitionByInstrument(midi)
            notes_to_parse = s2.parts[0].recurse() 
        except Exception:  # file has notes in a flat structure
            notes_to_parse = midi.flat.notes

        for element in notes_to_parse:
            if isinstance(element, note.Note):
                song.append([str(element.pitch), element.offset, element.duration])
            elif isinstance(element, chord.Chord):
                song_note = '.'.join(str(n) for n in element.normalOrder)
                song.append([song_note, element.offset, element.duration])
        notes.append(song)

    return notes

def get_notes_with_key(path, filter_key, mode):
    """
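        Gets all notes and chords from the MIDI files whose key matches filter_key

        Parameters
        ----------
        path : str
            The directory containing the MIDI files
        filter_key : str
            The string to filter the key on
        mode : int
            The part of the key to compare:
                0 - tonic only (e.g. "C")
                1 - major/minor only (e.g. "major")
                else - tonic and major/minor (e.g. "Cmajor")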
    """
    notes = []

    for file in glob.glob(path + "*.mid"):        
        song = []
        midi = converter.parse(file)
        
        # Only use music of the same key
        key = midi.analyze('key')
        if(mode==0):
            key_string = str(key.tonic.name)
        elif(mode==1):
            key_string = str(key.mode)
        else:
            key_string = str(key.tonic.name + key.mode)
            
        if(key_string==filter_key):
            print("Parsing %s" % file)

            notes_to_parse = None

            try: # file has instrument parts
                s2 = instrument.partitionByInstrument(midi)
                notes_to_parse = s2.parts[0].recurse() 
            except Exception:  # file has notes in a flat structure
                notes_to_parse = midi.flat.notes

            for element in notes_to_parse:
                if isinstance(element, note.Note):
                    song.append([str(element.pitch), element.offset, element.duration])
                elif isinstance(element, chord.Chord):
                    song_note = '.'.join(str(n) for n in element.normalOrder)
                    song.append([song_note, element.offset, element.duration])
            notes.append(song)

    return notes

# def get_notes_with_key(path, filter_key, mode):
#     """
#         Gets all notes and chords from midi files where the key matches the string input

#         Parameters
#         ----------
#         path : str
#             The path to the file
#         filter_key : str
#             The string to filter the key on
#         mode : int
#             The type of key used where:
#                 0 - key
#                 1 - major/minor
#                 else - key and major/minor
#     """
#     notes = []

#     for file in glob.glob(path + "*.mid"):        
#         song = []
#         midi = converter.parse(file)
        
#         # Only use music of the same key
#         key = midi.analyze('key')
#         if(mode==0):
#             key_string = str(key.tonic.name)
#         elif(mode==1):
#             key_string = str(key.mode)
#         else:
#             key_string = str(key.tonic.name + key.mode)
            
#         if(key_string==filter_key):
#             print("Parsing %s" % file)
#             notes_to_parse = None

#             try: # file has instrument parts
#                 s2 = instrument.partitionByInstrument(midi)
#                 notes_to_parse = s2.parts[0].recurse() 
#             except: # file has notes in a flat structure
#                 notes_to_parse = midi.flat.notes

#             for element in notes_to_parse:
#                 if isinstance(element, note.Note):
#                     song.append(str(element.pitch))
#                 elif isinstance(element, chord.Chord):
#                     song.append('.'.join(str(n) for n in element.normalOrder))

#     return notes

""" Train a Neural Network to generate music """
# Get notes from midi files
input_dir_choice = 0
input_dir_names = ["Pokemon", "LoZ", "Pokemon GSC", "ABBA"]

input_path = "../" + input_dir_names[input_dir_choice] + " MIDIs/"
# example of each mode: 0 - C, 1 - major, 2 - Cmajor
notes = get_notes_with_key(input_path, "minor", 1)
# notes = get_notes(input_path)

Parsing ../Pokemon MIDIs\Pokemon - Farewell, Pikachu!.mid
Parsing ../Pokemon MIDIs\Pokemon - Lugias Song.mid
Parsing ../Pokemon MIDIs\Pokemon - Oracion.mid
Parsing ../Pokemon MIDIs\Pokemon - Pallet Town.mid
Parsing ../Pokemon MIDIs\Pokemon - The Ghost at Maiden's Peak.mid
Parsing ../Pokemon MIDIs\Pokemon Black 2White 2 - Ns Room.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Battle CherenBianca.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Battle Elite Four.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Battle Team Plasma.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Champion Alder.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Driftveil City.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Ending Onward to Our Own Futures.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Mistralton City.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - N the Pokemon Child.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Ns Castle.mid
Parsing ../Pokemon MIDIs\Pokemon BlackWhite - Pok

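I will now use music21's key analysis (`analyze('key')`) to estimate the key of each song and count how many songs fall in each key. Note that music21 writes flats with a `-`, so `A-` means A-flat.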

In [4]:
def print_key(path):
    key_count = dict()
    for file in glob.glob(path + "*.mid"):
        print("Parsing %s" % file)
        
        midi = converter.parse(file)
        
        key = midi.analyze('key')
        key_string = key.tonic.name + key.mode
        if (key_string in key_count): 
            key_count[key_string] += 1
        else: 
            key_count[key_string] = 1
        print(key.tonic.name, key.mode)
    return key_count

input_dir_choice = 2
input_dir_names = ["Pokemon", "LoZ", "Pokemon GSC", "ABBA"]

input_path = "../" + input_dir_names[input_dir_choice] + " MIDIs/"
key_count = print_key(input_path)
# key_count = print_key("../test MIDIs/")
key_count

Parsing ../Pokemon GSC MIDIs\Pokemon Gold, Silver, Crystal - Cinnabar Island (HGSS Version).mid
G major
Parsing ../Pokemon GSC MIDIs\Pokemon Gold, Silver, Crystal - S.S. Aqua .mid
G major
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Azalea TownBlackthorn City.mid
C# major
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Bicycle.mid
E minor
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Bug Catching Contest.mid
E minor
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Burned Tower.mid
E minor
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Champion Battle.mid
G# minor
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Cherrygrove CityMahogany Town.mid
F major
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Dance Theatre.mid
A minor
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Dark Cave.mid
A- major
Parsing ../Pokemon GSC MIDIs\Pokemon GoldSilverCrystal - Dragons Den.mid
C# minor
Parsing ../Pokemon GSC MIDIs\Pokemon

{'Gmajor': 5,
 'C#major': 1,
 'Eminor': 3,
 'G#minor': 3,
 'Fmajor': 3,
 'Aminor': 3,
 'A-major': 4,
 'C#minor': 3,
 'Cmajor': 8,
 'Emajor': 4,
 'Dmajor': 8,
 'B-minor': 4,
 'E-major': 3,
 'Bmajor': 2,
 'Amajor': 2,
 'Fminor': 1,
 'Cminor': 1,
 'Dminor': 1,
 'F#major': 1,
 'E-minor': 1}

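Next, I will collect every distinct note/chord, offset and duration that appears in the data. These vocabularies determine how the data is encoded in a machine-readable format. Offsets are stored as the gap from the previous note rather than as absolute positions in the song.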

In [3]:
possibleNotes = set([item[0] for sublist in notes for item in sublist])

# Processing for offsets
possibleOffsets = []
possibleDurations = []

# For each song
for index, song in enumerate(notes):
    song_length = len(song)
    
    # For each note, calculate the difference in offset between this and the previous note
    song_offsets = []
    song_durations = []
    for idx in range(song_length):
        # The first note has no predecessor, so its relative offset is 0.0
        offset = round(song[idx][1] - song[idx - 1][1], 3) if idx > 0 else 0.0
        song_offsets.append(offset)
        if offset not in possibleOffsets:
            possibleOffsets.append(offset)
        
        duration = song[idx][2].quarterLength
        song_durations.append(duration)
        if duration not in possibleDurations:
            possibleDurations.append(duration)
            
    # Update the notes to reflect this
    for idx in range(song_length):
        notes[index][idx][1] = song_offsets[idx]
        notes[index][idx][2] = song_durations[idx]

n_notes = len(possibleNotes)
n_offset = len(possibleOffsets)
n_duration = len(possibleDurations)


possibleNotes = np.array(list(possibleNotes))
possibleOffsets = np.array(list(possibleOffsets))
possibleDurations = np.array(list(possibleDurations))
notes = np.array([list([list(subsublist) for subsublist in sublist]) for sublist in notes])
len(possibleNotes), len(possibleOffsets), len(possibleDurations)

(394, 45, 52)
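
The offset processing above can be illustrated on a toy example. This is a minimal sketch, assuming the gap is defined for every note after the first; the helper name `to_relative_offsets` and the onset values are made up for illustration and are not part of the notebook.

```python
def to_relative_offsets(absolute_offsets, ndigits=3):
    """Return the gap between each note's onset and the previous note's onset.

    The first note has no predecessor, so its gap is defined as 0.0.
    """
    if not absolute_offsets:
        return []
    deltas = [0.0]
    for previous, current in zip(absolute_offsets, absolute_offsets[1:]):
        deltas.append(round(current - previous, ndigits))
    return deltas


# Hypothetical onsets (in quarter notes) for a five-note phrase.
onsets = [0.0, 0.5, 1.0, 1.0, 2.5]
deltas = to_relative_offsets(onsets)

# The distinct gaps form the "offset vocabulary" the model chooses from.
vocabulary = sorted(set(deltas))
print(deltas)
print(vocabulary)
```

Storing gaps instead of absolute positions keeps the vocabulary small, because the same few rhythmic steps recur throughout a song while absolute positions never repeat.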

Now I will prepare the sequences of notes by looking at each song individually. I will first take windows of **sequence_length** notes with a stride of **step_size** from each song; the note that follows each window is its target. Then I will map the notes, offsets and durations to integers so the model can learn from them, scale the inputs to the range 0-1, and one-hot encode the targets.

In [5]:
def prepare_sequences(notes, possibleNotes, possibleOffsets, possibleDurations):
    """ Prepare the sequences used by the Neural Network """
    sequence_length = 100
    step_size = 1

    # create a dictionary to map pitches to integers
    pitchnames = sorted(possibleNotes)
    note_to_int = dict((note, number) for number, note in enumerate(pitchnames))
    
    # create a dictionary to map offset to integers
    offsetnames = sorted(possibleOffsets)
    offset_to_int = dict((offset, number) for number, offset in enumerate(offsetnames))
    
    # create a dictionary to map duration to integers
    durationnames = sorted(possibleDurations)
    duration_to_int = dict((duration, number) for number, duration in enumerate(durationnames))
    
    # find number of each possible choice for normalization
    n_notes = len(possibleNotes)
    n_offset = len(possibleOffsets)
    n_duration = len(possibleDurations)

    network_input = []
    network_output_notes = []
    network_output_offset = []
    network_output_duration = []


    # create input sequences and the corresponding outputs
    for song in notes:
        for i in range(0, len(song) - sequence_length, step_size):
            sequence_in = song[i:i + sequence_length]
            sequence_out = song[i + sequence_length]
            network_input.append([np.array([note_to_int[row[0]] / float(n_notes), offset_to_int[row[1]] / float(n_offset), duration_to_int[row[2]] / float(n_duration)]) for row in sequence_in])
            network_output_notes.append(np.array([note_to_int[sequence_out[0]]]))
            network_output_offset.append(np.array([offset_to_int[sequence_out[1]]]))
            network_output_duration.append(np.array([duration_to_int[sequence_out[2]]]))


    # reshape the input into a format compatible with LSTM layers
    n_patterns = len(network_input)
    network_input = np.reshape(network_input, (n_patterns, sequence_length, 3))

    # Make one-hot-encoding
    network_output_notes = np_utils.to_categorical(network_output_notes, num_classes=n_notes)
    network_output_offset = np_utils.to_categorical(network_output_offset, num_classes=n_offset)
    network_output_duration = np_utils.to_categorical(network_output_duration, num_classes=n_duration)


    return (network_input, network_output_notes, network_output_offset, network_output_duration)

network_input, network_output_notes, network_output_offset, network_output_duration = prepare_sequences(notes, possibleNotes, possibleOffsets, possibleDurations)
network_input.shape

(55548, 100, 3)
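
To make the windowing concrete, here is a small sketch on a toy song. It assumes the same `range(0, len(song) - sequence_length, step_size)` loop used in `prepare_sequences`; the generator name `sliding_windows` and the seven-note song are hypothetical.

```python
def sliding_windows(song, sequence_length, step_size):
    """Yield (input_window, next_item) pairs from one song."""
    for i in range(0, len(song) - sequence_length, step_size):
        yield song[i:i + sequence_length], song[i + sequence_length]


# Toy "song" of pitch names; real songs hold [pitch, offset, duration] rows.
song = ["C4", "E4", "G4", "C5", "G4", "E4", "C4"]
vocabulary = sorted(set(song))
note_to_int = {name: index for index, name in enumerate(vocabulary)}

pairs = list(sliding_windows(song, sequence_length=3, step_size=2))
for window, target in pairs:
    # Dividing by the vocabulary size scales each index into [0, 1).
    scaled = [note_to_int[name] / len(vocabulary) for name in window]
    print(window, "->", target, scaled)
```

With a step size of 1 the same song yields four windows instead of two, which is one reason a smaller step produces many more training examples.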

## Constructing the model

I will now construct the model using CuDNNLSTM cells because they are significantly faster than regular LSTM cells on an NVIDIA GPU, being optimized for CUDA (via cuDNN). The network stacks three CuDNNLSTM layers (the second and third wrapped in Bidirectional) with dropout in between, followed by a dense layer that feeds three softmax output layers: one each for the note, the offset and the duration.

Hyperparameters:
* Optimizer - Adam, a widely used adaptive optimizer that usually works well with its default settings
* Loss - categorical_crossentropy, the standard loss for multi-class classification with softmax outputs and one-hot targets. Each of the three outputs gets its own loss and they are weighted equally
* Epochs - More epochs are generally better as long as the model doesn't overfit. I track loss over time and save a checkpoint every 3 epochs, so I can fall back to an earlier model if needed
* Batch Size - This determines how many training sequences are processed per gradient update. A batch mixes sequences from different songs, which could interfere with each other, so I will keep it small.
    * A smaller batch size seems to add more variation
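
As a worked illustration of the loss, the sketch below computes categorical cross-entropy by hand for one training example and combines the three outputs as a weighted sum, which is how Keras combines losses for a multi-output model. The probabilities and the tiny vocabularies are invented for the example.

```python
import math


def categorical_crossentropy(one_hot_target, predicted_probs):
    """Cross-entropy for one sample: -sum(target * log(prediction))."""
    return -sum(
        t * math.log(p)
        for t, p in zip(one_hot_target, predicted_probs)
        if t > 0
    )


# Hypothetical (one-hot target, softmax output) pairs for one example.
heads = {
    "note": ([0, 1, 0], [0.2, 0.7, 0.1]),
    "offset": ([1, 0], [0.9, 0.1]),
    "duration": ([0, 0, 1], [0.25, 0.25, 0.5]),
}
loss_weights = {"note": 1.0, "offset": 1.0, "duration": 1.0}

per_head = {name: categorical_crossentropy(t, p) for name, (t, p) in heads.items()}
total_loss = sum(loss_weights[name] * loss for name, loss in per_head.items())
print(per_head)
print(total_loss)
```

Only the probability assigned to the correct class contributes to each term, so a confident wrong prediction is penalized much more heavily than an uncertain one.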

In [6]:
from keras import Input
from keras.models import Model

def create_network(network_input, n_notes, n_offset, n_duration):
    """ create the structure of the neural network """
#     model = Sequential()
#     model.add(CuDNNLSTM(512, input_shape=(network_input.shape[1], network_input.shape[2]), return_sequences=True))
#     model.add(Dropout(0.3))
#     model.add(Bidirectional(CuDNNLSTM(512, return_sequences=True)))
#     model.add(Dropout(0.3))
#     model.add(Bidirectional(CuDNNLSTM(512)))
#     model.add(Dense(256))
#     model.add(Dropout(0.3))
#     model.add(Dense(3, input_shape=(n_notes, n_offset, n_duration)))
    
    # Named `inputs` to avoid shadowing the built-in input()
    inputs = Input(shape=(network_input.shape[1], network_input.shape[2]))
    
    lstm_1 = CuDNNLSTM(512, return_sequences=True)(inputs)
    dropout_1 = Dropout(0.3)(lstm_1)
    lstm_2 = Bidirectional(CuDNNLSTM(512, return_sequences=True))(dropout_1)
    dropout_2 = Dropout(0.3)(lstm_2)
    lstm_3 = Bidirectional(CuDNNLSTM(512))(dropout_2)
    dense_1 = Dense(256)(lstm_3)
    dropout_3 = Dropout(0.3)(dense_1)
    # One softmax head per predicted attribute, all sharing the same trunk
    output_notes = Dense(n_notes, activation='softmax')(dropout_3)
    output_offset = Dense(n_offset, activation='softmax')(dropout_3)
    output_duration = Dense(n_duration, activation='softmax')(dropout_3)
    
    model = Model(inputs=inputs, outputs=[output_notes, output_offset, output_duration])
    
    # Each head gets its own categorical cross-entropy loss, weighted equally
    model.compile(loss=['categorical_crossentropy', 'categorical_crossentropy', 'categorical_crossentropy'], optimizer='adam', loss_weights=[1., 1., 1.])
    return model

# Set up the model
model = create_network(network_input, n_notes, n_offset, n_duration)
history = History()

# Save a checkpoint every 3 epochs (because training isn't cheap!!!); each checkpoint can also be used to generate music
outputDest = '../output/LSTM_' + input_dir_names[input_dir_choice] + '_' + str(int(time.time())) + '/'
if not os.path.exists(outputDest):
    os.makedirs(outputDest)

cp_callback = ModelCheckpoint(filepath=outputDest + "LSTMmodel_weights_{epoch:02d}.hdf5",
                              save_weights_only=True,
                              verbose=1,
                              period=3)

# Set parameters
n_epochs = 100
batch_size = 40
model.summary()

Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 100, 3)       0                                            
__________________________________________________________________________________________________
cu_dnnlstm_1 (CuDNNLSTM)        (None, 100, 512)     1058816     input_1[0][0]                    
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 100, 512)     0           cu_dnnlstm_1[0][0]               
__________________________________________________________________________________________________
bidirectional_1 (Bidirectional) (None, 100, 1024)    4202496     dropout_1[0][0]                  
____________________________________________________________________________________________

## Training the Model

I will save the final model, and also keep checkpoints along the way. These let me fall back to an earlier epoch if the model starts to overfit, and each one can be used to generate a different MIDI.

In [7]:
model.fit(network_input, [network_output_notes, network_output_offset, network_output_duration], callbacks=[history, cp_callback], epochs=n_epochs, batch_size=batch_size)
model.save(outputDest + 'LSTMmodel_final.h5')


Epoch 1/100
Epoch 2/100
Epoch 3/100

Epoch 00003: saving model to ../output/LSTM_Pokemon_1570542620/LSTMmodel_weights_03.hdf5
Epoch 4/100
Epoch 5/100
Epoch 6/100

Epoch 00006: saving model to ../output/LSTM_Pokemon_1570542620/LSTMmodel_weights_06.hdf5
Epoch 7/100
Epoch 8/100
Epoch 9/100

Epoch 00009: saving model to ../output/LSTM_Pokemon_1570542620/LSTMmodel_weights_09.hdf5
Epoch 10/100
Epoch 11/100
Epoch 12/100

Epoch 00012: saving model to ../output/LSTM_Pokemon_1570542620/LSTMmodel_weights_12.hdf5
Epoch 13/100
Epoch 14/100
Epoch 15/100

Epoch 00015: saving model to ../output/LSTM_Pokemon_1570542620/LSTMmodel_weights_15.hdf5
Epoch 16/100
Epoch 17/100
Epoch 18/100

Epoch 00018: saving model to ../output/LSTM_Pokemon_1570542620/LSTMmodel_weights_18.hdf5
Epoch 19/100
Epoch 20/100
Epoch 21/100

Epoch 00021: saving model to ../output/LSTM_Pokemon_1570542620/LSTMmodel_weights_21.hdf5
Epoch 22/100
Epoch 23/100
Epoch 24/100

Epoch 00024: saving model to ../output/LSTM_Pokemon_1570542620/LST

KeyboardInterrupt: 

In [8]:
# Plot the model losses
pd.DataFrame(history.history).plot()
plt.savefig(outputDest + 'LSTM_Loss_per_Epoch.png', transparent=True)
plt.close()

## Generating Music

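I will now use the model to generate music. A randomly chosen sequence from the training data seeds the model, which predicts the next note. That prediction is appended to the window (and the oldest note dropped) to predict the one after it, and so on until a full song of 500 notes has been generated.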
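
The rolling-window loop can be sketched independently of Keras. In the example below, `toy_predictor` is a hypothetical stand-in for the trained model, and the integers stand in for encoded notes; only the window handling mirrors the generation procedure.

```python
from collections import deque


def generate(predict_next, seed, n_steps):
    """Repeatedly predict one item and slide the window forward by one."""
    window = deque(seed, maxlen=len(seed))  # oldest item drops out on append
    generated = []
    for _ in range(n_steps):
        next_item = predict_next(list(window))
        generated.append(next_item)
        window.append(next_item)
    return generated


def toy_predictor(window):
    """Hypothetical stand-in for the model: sum of the last two items, mod 12."""
    return (window[-1] + window[-2]) % 12


melody = generate(toy_predictor, seed=[0, 4, 7], n_steps=6)
print(melody)
```

Because a deterministic predictor always returns the same output for the same window, a generator like this can eventually fall into a repeating cycle, which may be one reason generated tracks sometimes loop on a short phrase.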

In [17]:
def generate_notes(model, network_input, possibleNotes, possibleOffsets, possibleDurations):
    """ Generate notes from the neural network based on a sequence of notes """
    # create a dictionary to map integers back to pitches
    pitchnames = sorted(possibleNotes)
    int_to_note = dict((number, note) for number, note in enumerate(pitchnames))
    
    # create a dictionary to map integers back to offsets
    offsetnames = sorted(possibleOffsets)
    int_to_offset = dict((number, offset) for number, offset in enumerate(offsetnames))
    
    # create a dictionary to map integers back to durations
    durationnames = sorted(possibleDurations)
    int_to_duration = dict((number, duration) for number, duration in enumerate(durationnames))
    
    # find number of each possible choice for normalization
    n_notes = len(possibleNotes)
    n_offset = len(possibleOffsets)
    n_duration = len(possibleDurations)
    
    # choose a random point to start
    start = np.random.randint(0, len(network_input)-1)
    pattern = network_input[start]
    sequence_length = pattern.shape[0]
    n_dim = pattern.shape[1]
    
    prediction_output = []
    
    # generate 500 notes
    for note_index in range(500):
        prediction_input = np.reshape(pattern, (1, sequence_length, n_dim))

        prediction = model.predict(prediction_input, verbose=0)
        
        note_int = np.argmax(prediction[0])
        note_normalized = note_int / float(n_notes)
        note = int_to_note[note_int]
        
        offset_int = np.argmax(prediction[1])
        offset_normalized = offset_int / float(n_offset)
        offset = int_to_offset[offset_int]
                
        duration_int = np.argmax(prediction[2])
        duration_normalized = duration_int / float(n_duration)
        duration = int_to_duration[duration_int]
        
        result = np.array([note_normalized, offset_normalized, duration_normalized])
        full_prediction = np.array([note, offset, duration])
        
        prediction_output.append(full_prediction)
        # Slide the window: drop the oldest step and append the new prediction
        pattern = np.vstack([pattern[1:], result])
        
    print([str(x[0]) for x in prediction_output])
    
    return prediction_output

prediction_output = generate_notes(model, network_input, possibleNotes, possibleOffsets, possibleDurations)

['C6', '4.5.9', 'F5', 'A5', '2.5.9', 'F5', '0.5', 'C3', 'C5', 'E5', 'G5', '5.9.0', 'C3', 'B-5', '11.4', 'C3', '11.0.4', 'C3', '4.9', 'G2', '0.4.7', '9.11.2', 'B2', '9.2', 'B3', '9.2', 'B-2', 'B-2', '9.2', 'B-3', '9.2', 'B-2', 'D3', 'D5', 'F#5', 'A5', '1.2.6', 'A5', '2.6', 'D5', '6.9', 'A5', '11.2.6', 'A5', 'C6', '4.5.9', 'A5', 'F3', 'C6', '4.5.9', 'F5', 'A5', '2.5.9', 'F5', '0.5', 'C3', 'C5', 'E5', 'G5', '5.9.0', 'C3', 'B-5', '11.4', 'C3', '11.0.4', 'C3', '4.9', 'G2', '0.4.7', '9.11.2', '9.11.2', '8.11', 'E3', 'B2', 'E2', 'A2', '8.9.1', 'A2', 'E2', '6.9.1', '11.0.4', 'C3', '11.0.4', '9.0.4', '7.0', 'G2', '0.4.7', 'G2', '9.2', 'B2', '9.11.2', 'B2', '2.8', 'F#2', '9.11.2', '8.10.2', 'G5', 'G#5', 'F5', 'D5', 'A4', 'B-4', 'G#4', 'F4', 'D4', 'A2', 'A4', 'C#5', 'E5', '8.9.1', 'E5', 'E5', 'A4', '1.4', 'E5', '6.9.1', 'E5', 'G5', '11.0.4', 'E5', 'C3', 'G5', 'E5', 'D5', 'E5', '9.0.4', 'D5', '7.0', 'G2', 'G4', 'B4', 'D5', '7.0', 'C3', '2.8', 'D5', 'E6', 'D5', 'E6', 'B5', 'B2', 'A5', 'E2', 'E6', '

Next, I will create a MIDI file from these notes and save it to disk.
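
Since the model predicts each offset as a gap from the previous note, writing the MIDI means turning those gaps back into absolute positions. The sketch below shows that accumulation under the assumption that the first note is always placed at position 0 and its own gap is ignored; `absolute_positions` is a hypothetical helper, not a function from the notebook.

```python
from itertools import accumulate


def absolute_positions(deltas):
    """Place the first note at 0 and each later note `delta` after the previous one."""
    if not deltas:
        return []
    # The first note's own gap is ignored; later gaps are summed cumulatively.
    return [0.0] + list(accumulate(deltas[1:]))


# Hypothetical predicted gaps (in quarter notes).
gaps = [0.0, 0.5, 0.5, 0.0, 1.5]
positions = absolute_positions(gaps)
print(positions)
```

A gap of 0.0 places a note at the same position as the previous one, so the two sound together.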

In [19]:
from music21 import duration as D

def create_midi(prediction_output, filename):
    """ convert the output from the prediction to notes and create a midi file
        from the notes """
    offset = 0
    output_notes = []

    # create note and chord objects based on the values generated by the model
    count = 0
    for pattern in prediction_output:
        note_str = pattern[0]
        offset_str = pattern[1]
        duration_str = pattern[2]
        # pattern is a chord
        if ('.' in note_str) or note_str.isdigit():
            notes_in_chord = note_str.split('.')
            notes = []
            for current_note in notes_in_chord:
                new_note = note.Note(int(current_note))
                new_note.storedInstrument = instrument.Piano()
                notes.append(new_note)
            new_chord = chord.Chord(notes)
            new_chord.offset = offset
            new_chord.duration = D.Duration(float(duration_str))
            output_notes.append(new_chord)
        # pattern is a note
        else:
            new_note = note.Note(note_str)
            new_note.offset = offset
            new_note.duration = D.Duration(float(duration_str))
            new_note.storedInstrument = instrument.Piano()
            output_notes.append(new_note)
        # increase offset each iteration so that notes do not stack
        offset += (float(prediction_output[count + 1][1])) if (count + 1 < len(prediction_output)) else 0
        count += 1

    midi_stream = stream.Stream(output_notes)
    midi_stream.write('midi', fp='{}.mid'.format(filename))
    
create_midi(prediction_output, outputDest + 'LSTM_output_final')
# create_midi(prediction_output, 'lavender')

Alternatively, I can run this script to generate a MIDI file from each saved checkpoint and select my favourite from a much larger album.
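
One practical detail: `glob.glob` returns paths in arbitrary order, and a running counter in the output name does not say which checkpoint produced which song. A possible alternative, sketched below, is to sort the checkpoints by the epoch number in the file name and reuse that number in the output name. The helper `epoch_of`, the `run/` paths and the `LSTM_output_epoch_` naming are assumptions for illustration only.

```python
import re


def epoch_of(path):
    """Extract the epoch number from a name like 'LSTMmodel_weights_03.hdf5'."""
    match = re.search(r"_weights_(\d+)\.hdf5$", path)
    if match is None:
        raise ValueError("no epoch number in %r" % path)
    return int(match.group(1))


# Hypothetical listing in the arbitrary order a directory scan might return.
paths = [
    "run/LSTMmodel_weights_12.hdf5",
    "run/LSTMmodel_weights_03.hdf5",
    "run/LSTMmodel_weights_102.hdf5",
    "run/LSTMmodel_weights_21.hdf5",
]

# Sort numerically by epoch, then name each output after its checkpoint.
ordered = sorted(paths, key=epoch_of)
output_names = ["LSTM_output_epoch_%02d" % epoch_of(p) for p in ordered]
print(output_names)
```

Naming each output after its epoch makes it easy to trace a favourite track back to the weights that produced it.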

In [14]:
# Have each model make a song
count = 0
filepaths = glob.glob(outputDest + "*.hdf5")
for model_path in filepaths:
    print("Composing from %s" % model_path)
    model.load_weights(model_path)
    prediction_output = generate_notes(model, network_input, possibleNotes, possibleOffsets, possibleDurations)
    create_midi(prediction_output, outputDest + 'LSTM_output_' + str(count))
    print(outputDest + 'LSTM_output_' + str(count))
    count += 1

Composing from ../output/LSTM_Pokemon_1570542620\LSTMmodel_weights_03.hdf5
['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',

../output/LSTM_Pokemon_1570542620/LSTM_output_2
Composing from ../output/LSTM_Pokemon_1570542620\LSTMmodel_weights_12.hdf5
['10.3', '0', '0', '1.4', '1.4', '1.4', '3.6', '10.3', '1.4', '1.4', '1.4', '1.4', '1.4', '11.4', '11.4', '11.4', '11.4', '11.4', '11.4', '11.4', '11.4', '11.4', '11.4', '11.4', '11.4', '11.4', '1.5', '1.4', '1.5', '1.5', '1.5', '10', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5', '1.5'

../output/LSTM_Pokemon_1570542620/LSTM_output_5
Composing from ../output/LSTM_Pokemon_1570542620\LSTMmodel_weights_21.hdf5
['F#2', '11.2.6', 'D3', '11.2.6', 'B2', 'F#2', 'B1', 'F#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'B2', 'F#2', 'B1', 'F#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'B2', 'F#2', 'B1', 'F#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'B2', 'F#2', 'F#1', 'C#2', 'F#2', '6.10.1', 'B-2', '6.10.1', 'F#2', 'B2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.6', 'F#2', 'C#2', 'F#1', 'C#2', 'F#2', '11.2.6', 'D3', '11.2.

../output/LSTM_Pokemon_1570542620/LSTM_output_8
Composing from ../output/LSTM_Pokemon_1570542620\LSTMmodel_weights_30.hdf5
['G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 'E-2', '1.6', 'F2', 'F#2', '0.5', 'C2', '10.3', 'F2', '1.6', 'E-2', '0.5', 'C2', 'G1', 