# Music Generator - TensorFlow Keras LSTM
Welcome to the Music Generator, a TensorFlow/Keras neural network that processes MIDI files and generates new music. It uses Long Short-Term Memory (LSTM) networks to read notes (including chords), durations, and offsets from input MIDI files. During training, the model learns to classify each of these elements, and it then generates new MIDI compositions from what it has learned.

## Purpose
This project originated from a collaboration with a friend who frequently arranges music for a cappella groups. The goal was to help her explore new musical possibilities by training the model on her arrangements and listening to what it generates. The training dataset comprises 41 of her MIDI-based arrangements, each condensed into a single track to simplify chord grouping. After 100 epochs of training, the total loss fell from 10.8136 to 0.9185. The final loss breaks down into the following components:
- Notes/Chords Loss: 0.5928
- Offset Loss: 0.1223
- Duration Loss: 0.2034
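
As a quick check on the breakdown above: for a multi-output model, Keras reports the total loss as the sum of the per-output losses, assuming the default loss weight of 1 for each output (the `model.compile` call in this notebook passes no `loss_weights`). The variable names below are illustrative and do not appear in the notebook.

```python
# Final per-output losses reported above (after 100 epochs).
component_losses = {
    "notes_chords": 0.5928,
    "offset": 0.1223,
    "duration": 0.2034,
}

# With equal loss weights, the total is the plain sum of the components.
total_loss = round(sum(component_losses.values()), 4)
print(total_loss)
```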

## Usage
This Music Generator can be executed in two primary modes: training with your own MIDI files or generating music using pre-trained weights. The specific configuration for each mode is detailed in the "Input Variables" section at the top of the MusicGeneratorFinal.ipynb file.

### Definitions
- Note: single pitch
- Chord: multiple pitches at once (will be used interchangeably with note throughout)
- Offset: time between one note/chord and the next
- Duration: length of note/chord
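
To make the note/chord distinction concrete, here is a minimal sketch of how the string tokens used later in this notebook can be told apart. Notes are stored as pitch names such as `B-4` (music21 notation for B-flat 4), and chords as dot-joined pitch classes such as `3.8`. The helper names `is_chord_token` and `chord_pitch_classes` are hypothetical; the test itself mirrors the `isdigit()` / `'.' in` check in the Create MIDI cell.

```python
def is_chord_token(token: str) -> bool:
    """Return True if the token encodes a chord rather than a single note."""
    # Chords are dot-joined pitch classes ("3.8"); a one-pitch chord is a bare digit string.
    return token.isdigit() or "." in token


def chord_pitch_classes(token: str) -> list[int]:
    """Split a chord token into its integer pitch classes."""
    return [int(part) for part in token.split(".")]


examples = ["B-4", "3.8", "8.0.3", "C#4"]
kinds = {token: ("chord" if is_chord_token(token) else "note") for token in examples}
print(kinds)
print(chord_pitch_classes("8.0.3"))
```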

## How to Run
Before proceeding, make the necessary adjustments only in the "Input Variables" section of the code:
1. Update "folder_midi" to specify the folder location containing your MIDI files for analysis.
2. Set "num_notes_to_generate" to indicate the number of notes you wish to generate in your new composition.
3. Specify "file_midi_output" and "folder_midi_output" to determine the location and filename for the generated MIDI output.
4. Set "pretrained" to either True or False depending on whether you have previously trained the model. Refer to the subsequent sections for guidance on configuring your choice.

### Training
If you intend to use this tool with new MIDI files to explore your own musical creations, follow these steps:
1. Adjust "folder_data" and "folder_weights" to specify the locations where you want to save your data and weights files.
2. Modify "num_epochs" to specify the desired number of training epochs. Keep in mind that longer training times may be required for higher accuracy. Running with GPUs is recommended for acceleration.

### Pre-Trained
If you have already trained the model and wish to generate new MIDIs without retraining, proceed as follows:
1. Update "file_notes", "file_durations", and "file_offsets" with the file paths to your notes, durations, and offsets files.
2. Specify "file_weights" with the file path to your pre-trained weights file in HDF5 format. It is recommended to use the weights file with the lowest loss for the best results.
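
One way to find the lowest-loss weights file is to read the loss back out of the checkpoint filename. The sketch below assumes files follow the `weights-{epoch:02d}-{loss:.4f}.hdf5` template used in the Fit Model cell; the helper `lowest_loss_weights` and the file listing are hypothetical (only the 10.8136 and 0.9185 values come from this README). A file with extra text in its name, such as the bundled `weights-sample-100-0.9185.hdf5`, would not match this pattern as written.

```python
import re

# Matches names produced by the template "weights-{epoch:02d}-{loss:.4f}.hdf5".
WEIGHTS_PATTERN = re.compile(r"weights-(\d+)-(\d+\.\d+)\.hdf5$")


def lowest_loss_weights(filenames):
    """Return the filename with the smallest loss encoded in its name, or None."""
    best_name, best_loss = None, float("inf")
    for name in filenames:
        match = WEIGHTS_PATTERN.search(name)
        if match is None:
            continue  # skip files that do not follow the naming template
        loss = float(match.group(2))
        if loss < best_loss:
            best_name, best_loss = name, loss
    return best_name


# Hypothetical listing; in practice this could come from glob.glob(folder_weights + "*.hdf5").
candidates = [
    "weights/weights-01-10.8136.hdf5",
    "weights/weights-57-1.4021.hdf5",
    "weights/weights-100-0.9185.hdf5",
]
print(lowest_loss_weights(candidates))
```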

After configuring the above settings, execute the code to generate your new MIDI composition. Be careful when specifying file locations, because existing files will be overwritten. Any output folders that do not already exist should be created in the cell that follows Input Variables.
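
As an alternative to checking each folder by hand, `os.makedirs` accepts `exist_ok=True`, which creates a folder only when it is missing. This is a minimal sketch, not the notebook's own code: `ensure_folders` is a hypothetical helper, and the demo runs inside a temporary directory so it does not touch your project.

```python
import os
import tempfile


def ensure_folders(*folders):
    """Create each folder if missing; existing folders are left untouched."""
    for folder in folders:
        os.makedirs(folder, exist_ok=True)


# Demonstrated inside a temporary directory so no project files are affected.
with tempfile.TemporaryDirectory() as tmp:
    output_folder = os.path.join(tmp, "output")
    ensure_folders(output_folder)
    ensure_folders(output_folder)  # second call is a no-op rather than an error
    created = os.path.isdir(output_folder)

print(created)
```

Note that this only protects folders. Files inside them are still overwritten when the notebook writes its outputs.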

If you are missing any packages, uncomment the following line to install everything listed in the included requirements.txt file. This notebook was created with Python 3.10.9.

In [1]:
# pip install -r requirements.txt 

In [2]:
#Import packages
import glob
import os
import pickle
import numpy
import tensorflow.keras.utils as np_utils
from music21 import converter, instrument, note, chord, stream
from tensorflow.keras import Model
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.layers import Activation, BatchNormalization, Bidirectional, concatenate, Dense, Dropout, Input, LSTM
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

### Input Variables
In this section, define the file locations and indicate whether the model has been pre-trained and, if so, where to find the weights. You can also set the number of epochs and the number of notes to generate. See the top section or README.md for instructions on updating this section.

In [12]:
# General Variables
folder_midi = 'input_midis/'
num_notes_to_generate = 400
folder_midi_output = 'output/'
file_midi_output = folder_midi_output + 'output_midi.mid'
pretrained = True

# Training
if not pretrained:
    folder_data = 'data/'
    folder_weights = 'weights/'
    num_epochs = 100

# Pre-Trained
if pretrained:
    file_notes = 'data/notes_sample'
    file_durations = 'data/durations_sample'
    file_offsets = 'data/offsets_sample'
    file_weights = 'weights/weights-sample-100-0.9185.hdf5'

Create output folders if they do not already exist.

In [13]:
# Folders used only when training
if not pretrained:
    for folder in (folder_data, folder_weights):
        if not os.path.exists(folder):
            os.makedirs(folder)
            print('Created folder: ' + folder)

# The MIDI output folder is written to in both modes
if not os.path.exists(folder_midi_output):
    os.makedirs(folder_midi_output)
    print('Created folder: ' + folder_midi_output)

### Parse MIDI Files
If the model has not been pre-trained, parse each track in each MIDI file with the Music21 package to gather notes (including chords), note durations, and offsets between notes. Otherwise, load the previously saved data for notes, durations, and offsets.

In [14]:
if pretrained:
    # Load the note, duration, and offset lists saved by a previous training run
    with open(file_notes, 'rb') as filepath:
        all_notes = pickle.load(filepath)

    with open(file_durations, 'rb') as filepath:
        all_durations = pickle.load(filepath)

    with open(file_offsets, 'rb') as filepath:
        all_offsets = pickle.load(filepath)

else:
    all_notes = []
    all_durations = []
    all_offsets = []
    all_midi_files = []

    for file in glob.glob(folder_midi + "*.mid"):
        print('Parsing: ' + file)
        try:
            midi_parsed = converter.parse(file)
            parts_to_parse = []

            # Check if file has multiple tracks/parts or if file is flat
            try:
                instr = instrument.partitionByInstrument(midi_parsed)
                # Add every instrument part to the list
                # (list.append returns None, so its result must not be assigned)
                for i in range(len(instr.parts)):
                    parts_to_parse.append(instr.parts[i].recurse())
            except Exception:
                parts_to_parse = [midi_parsed.flat.notes]

            # Parse all elements in each part
            base_offset = 0
            for part in parts_to_parse:
                for element in part:

                    # note/chord
                    if isinstance(element, note.Note):
                        all_notes.append(str(element.pitch))
                    elif isinstance(element, chord.Chord):
                        all_notes.append('.'.join(str(n) for n in element.normalOrder))
                    else:
                        # Skip rests, instruments, and other non-note elements
                        # so the three lists stay the same length
                        continue

                    # duration
                    all_durations.append(str(element.duration.quarterLength))

                    # offset relative to the previous note/chord
                    adjusted_offset = element.offset - base_offset
                    all_offsets.append(str(adjusted_offset))
                    base_offset = element.offset
            all_midi_files.append(file)
        except Exception:
            print('Failed to parse: ' + file)

    # Save the parsed data so later runs can skip parsing
    with open(folder_data + 'notes', 'wb') as filepath:
        pickle.dump(all_notes, filepath)
    with open(folder_data + 'durations', 'wb') as filepath:
        pickle.dump(all_durations, filepath)
    with open(folder_data + 'offsets', 'wb') as filepath:
        pickle.dump(all_offsets, filepath)
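
The offset bookkeeping in the parsing loop can be seen in isolation with plain numbers. This sketch assumes a list of absolute positions in quarter lengths; `relative_offsets` is a hypothetical helper, not part of the notebook. It reproduces the `element.offset - base_offset` step, which turns absolute positions into gaps since the previous note/chord.

```python
def relative_offsets(absolute_offsets):
    """Convert absolute positions into gaps since the previous element."""
    gaps = []
    previous = 0
    for position in absolute_offsets:
        gaps.append(position - previous)
        previous = position
    return gaps


# Two elements start together at 0.5, so the second one has a gap of 0.0.
absolute = [0.0, 0.5, 0.5, 1.5, 2.0]
gaps = relative_offsets(absolute)
print(gaps)
```

A gap of 0.0 means the element starts at the same time as the previous one, which is why offsets of 0.0 appear in the generated output.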

### Function to Create Sequences
This function is used to create sequences for notes/chords, durations, and offsets.

In [15]:
def create_sequences(elements, vocab_length):
	seq_length = 100
	elementnames = sorted(set(item for item in elements))
	notes_as_ints = dict((note, num) for num, note in enumerate(elementnames))

	neural_network_input = []
	neural_network_output = []
	for i in range(0, len(elements) - seq_length, 1):
		sequence_in = elements[i:i + seq_length]
		sequence_out = elements[i + seq_length]
		neural_network_input.append([notes_as_ints[char] for char in sequence_in])
		neural_network_output.append(notes_as_ints[sequence_out])

	num_patterns = len(neural_network_input)

	# reshape the input into a format compatible with LSTM layers
	normalized_input = numpy.reshape(neural_network_input, (num_patterns, seq_length, 1))
	# normalize input
	normalized_input = normalized_input / float(vocab_length)

	neural_network_output = np_utils.to_categorical(neural_network_output)

	return (neural_network_input, normalized_input, neural_network_output)
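
To see what the function produces, here is a small pure-Python sketch of the same sliding-window idea with a window of 3 instead of 100. It omits the reshaping, normalization, and one-hot encoding steps, and `sliding_windows` is a hypothetical name used only for this illustration.

```python
def sliding_windows(tokens, seq_length):
    """Build (input window, next token) training pairs as integer indices."""
    vocab = sorted(set(tokens))
    token_to_int = {token: index for index, token in enumerate(vocab)}

    inputs, targets = [], []
    for start in range(len(tokens) - seq_length):
        window = tokens[start:start + seq_length]
        inputs.append([token_to_int[token] for token in window])
        # The target is the token that immediately follows the window.
        targets.append(token_to_int[tokens[start + seq_length]])
    return inputs, targets, token_to_int


tokens = ["C4", "E4", "G4", "C4", "E4"]
inputs, targets, mapping = sliding_windows(tokens, seq_length=3)
print(mapping)
print(inputs)
print(targets)
```

Each window of 3 tokens is paired with the single token that follows it, so 5 tokens yield 2 training pairs. With the notebook's window of 100, a list of N elements yields N - 100 pairs.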

Next, we use the above function to create those sequences and get the number of notes/chords, durations, and offsets we have.

In [16]:
vocab_length_notes = len(set(all_notes))
vocab_length_offsets = len(set(all_offsets))
vocab_length_durations = len(set(all_durations))
print('Number of notes/chords: ' + str(vocab_length_notes))
print('Number of offsets: ' + str(vocab_length_offsets))
print('Number of durations: ' + str(vocab_length_durations))

neural_network_input_notes, normalized_input_notes, neural_network_output_notes = \
    create_sequences(all_notes, vocab_length_notes)

neural_network_input_offsets, normalized_input_offsets, neural_network_output_offsets = \
    create_sequences(all_offsets, vocab_length_offsets)

neural_network_input_durations, normalized_input_durations, neural_network_output_durations = \
    create_sequences(all_durations, vocab_length_durations)

Number of notes/chords: 515
Number of offsets: 118
Number of durations: 37


### Create Model

In [17]:

# Create layers for input of neural network for notes/chords, durations, and offsets
layer_inputNotes = Input(shape=(normalized_input_notes.shape[1], normalized_input_notes.shape[2]))
layer_inputOffsets = Input(shape=(normalized_input_offsets.shape[1], normalized_input_offsets.shape[2]))
layer_inputDurations = Input(shape=(normalized_input_durations.shape[1], normalized_input_durations.shape[2]))
input_layers = [layer_inputNotes, layer_inputOffsets, layer_inputDurations]

# Create branches off layers to combine for outputs
shape_inputNotes = (normalized_input_notes.shape[1], normalized_input_notes.shape[2])
shape_inputOffsets = (normalized_input_offsets.shape[1], normalized_input_offsets.shape[2])
shape_inputDurations = (normalized_input_durations.shape[1], normalized_input_durations.shape[2])

branch_inputNotes = Dropout(0.2)(LSTM(256, input_shape=shape_inputNotes, return_sequences=True)(layer_inputNotes))
branch_inputOffsets = Dropout(0.2)(LSTM(256, input_shape=shape_inputOffsets, return_sequences=True)(layer_inputOffsets))
branch_inputDurations = Dropout(0.2)(LSTM(256, input_shape=shape_inputDurations, return_sequences=True)(layer_inputDurations))

input_branches = concatenate([branch_inputNotes, branch_inputOffsets, branch_inputDurations])

# Combines everything learned in LSTMs for notes, durations, and offsets
inputs_combined = LSTM(512, return_sequences=True)(input_branches)
inputs_dropout = Dropout(0.3)(inputs_combined)
inputs_lstm2 = LSTM(512)(inputs_dropout)
inputs_batchNorm = BatchNormalization()(inputs_lstm2)
inputs_dropout2 = Dropout(0.3)(inputs_batchNorm)
inputs_dense = Dense(256, activation='relu')(inputs_dropout2)

# Branch of neural network that classifies notes/chords
layer_outputNotes_dense = Dense(128, activation='relu')(inputs_dense)
layer_outputNotes_batchNorm = BatchNormalization()(layer_outputNotes_dense)
layer_outputNotes_dropout = Dropout(0.3)(layer_outputNotes_batchNorm)
layer_outputNotes = Dense(vocab_length_notes, activation='softmax', name="Note")(layer_outputNotes_dropout)

# Branch of neural network that classifies offsets
layer_outputOffsets_dense = Dense(128, activation='relu')(inputs_dense)
layer_outputOffsets_batchNorm = BatchNormalization()(layer_outputOffsets_dense)
layer_outputOffsets_dropout = Dropout(0.3)(layer_outputOffsets_batchNorm)
layer_outputOffsets = Dense(vocab_length_offsets, activation='softmax', name="Offset")(layer_outputOffsets_dropout)

# Branch of neural network that classifies durations
layer_outputDurations_dense = Dense(128, activation='relu')(inputs_dense)
layer_outputDurations_batchNorm = BatchNormalization()(layer_outputDurations_dense)
layer_outputDurations_dropout = Dropout(0.3)(layer_outputDurations_batchNorm)
layer_outputDurations = Dense(vocab_length_durations, activation='softmax', name="Duration")(layer_outputDurations_dropout)

output_layers = [layer_outputNotes, layer_outputOffsets, layer_outputDurations]

# Define inputs and outputs of model
model = Model(inputs=input_layers, outputs=output_layers)
model.compile(loss='categorical_crossentropy', optimizer='adam')


### Fit Model
If the model has been pre-trained, load in the weights from the weights file location specified in the beginning. Otherwise, fit the model (may take a long time, recommended to run on GPUs).

In [18]:
if pretrained:
    model.load_weights(file_weights)

else:
    # os.path.join avoids a doubled slash when folder_weights already ends in '/'
    filepath = os.path.join(folder_weights, "weights-{epoch:02d}-{loss:.4f}.hdf5")
    checkpoint = ModelCheckpoint(
        filepath,
        monitor='loss',
        verbose=0,
        save_best_only=True,
        mode='min'
    )
    callbacks_list = [checkpoint]
    inputs = [normalized_input_notes, normalized_input_offsets, normalized_input_durations]
    outputs = [neural_network_output_notes, neural_network_output_offsets, neural_network_output_durations]
    model.fit(inputs, outputs, epochs=num_epochs, batch_size=64, callbacks=callbacks_list, verbose=1)


Pick a random starting point in the input sequences to seed generation for notes/chords, offsets, and durations, leaving room for the number of notes to be generated.

In [19]:
# numpy.random.randint requires high > low, so the upper bound is clamped to at least 1
sequence_notes = numpy.random.randint(0, max(1, len(neural_network_input_notes)-1-num_notes_to_generate))
sequence_offsets = numpy.random.randint(0, max(1, len(neural_network_input_offsets)-1-num_notes_to_generate))
sequence_durations = numpy.random.randint(0, max(1, len(neural_network_input_durations)-1-num_notes_to_generate))

print('Starting point of notes/chords: ' + str(sequence_notes) + ' of ' + str(len(neural_network_input_notes)))
print('Starting point of offsets: ' + str(sequence_offsets) + ' of ' + str(len(neural_network_input_offsets)))
print('Starting point of durations: ' + str(sequence_durations) + ' of ' + str(len(neural_network_input_durations)))

Starting point of notes/chords: 20672 of 23287
Starting point of offsets: 14818 of 23287
Starting point of durations: 17766 of 23287


### Generate music
This section builds each note/chord, duration, and offset from the model predictions, one step at a time, until the specified number of notes/chords has been generated.

In [24]:
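# Copy the seed sequences so appending predictions does not modify the stored inputs
pattern_notes = list(neural_network_input_notes[sequence_notes])
pattern_offsets = list(neural_network_input_offsets[sequence_offsets])
pattern_durations = list(neural_network_input_durations[sequence_durations])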

int_to_note = dict((number, note) for number, note in enumerate(sorted(set(item for item in all_notes))))
int_to_offset = dict((number, note) for number, note in enumerate(sorted(set(item for item in all_offsets))))
int_to_duration = dict((number, note) for number, note in enumerate(sorted(set(item for item in all_durations))))

prediction_output_notes = []
prediction_output_offsets = []
prediction_output_durations = []

for note_index in range(num_notes_to_generate):
	inputPredict_note = numpy.reshape(pattern_notes, (1, len(pattern_notes), 1))
	inputPredict_note = inputPredict_note / float(vocab_length_notes)
	
	inputPredict_offset = numpy.reshape(pattern_offsets, (1, len(pattern_offsets), 1))
	inputPredict_offset = inputPredict_offset / float(vocab_length_offsets)
	
	inputPredict_duration = numpy.reshape(pattern_durations, (1, len(pattern_durations), 1))
	inputPredict_duration = inputPredict_duration / float(vocab_length_durations)

	prediction = model.predict([inputPredict_note, inputPredict_offset, inputPredict_duration], verbose=0)

	result_note = numpy.argmax(prediction[0])
	result_offset = numpy.argmax(prediction[1])
	result_duration = numpy.argmax(prediction[2])
	
	if ('.' in str(int_to_note[result_note])):
		print(str(note_index+1) + " - Chord: " + str(int_to_note[result_note]) 
			+ ", Offset: " + str(int_to_offset[result_offset]) 
			+ ", Duration: " + str(int_to_duration[result_duration]))
	else:
		print(str(note_index+1) + " - Note: " + str(int_to_note[result_note]) 
			+ ", Offset: " + str(int_to_offset[result_offset]) 
			+ ", Duration: " + str(int_to_duration[result_duration]))
	
	# Append prediction outputs to their respective lists
	prediction_output_notes.append(int_to_note[result_note])
	prediction_output_durations.append(int_to_duration[result_duration])

	try:
		prediction_output_offsets = numpy.append(prediction_output_offsets, float(int_to_offset[result_offset]))
	except ValueError:
		# Fractional offsets are stored as strings such as '1/3'
		top, bottom = int_to_offset[result_offset].split('/')
		prediction_output_offsets = numpy.append(prediction_output_offsets, float(top)/float(bottom))

	pattern_notes.append(result_note)
	pattern_notes = pattern_notes[1:len(pattern_notes)]

	pattern_offsets.append(result_offset)
	pattern_offsets = pattern_offsets[1:len(pattern_offsets)]

	pattern_durations.append(result_duration)
	pattern_durations = pattern_durations[1:len(pattern_durations)]

1 - Note: B-4, Offset: 0.0, Duration: 0.5
2 - Chord: 3.5, Offset: 0.5, Duration: 2.0
3 - Note: B-4, Offset: 0.0, Duration: 2.5
4 - Note: F4, Offset: 0.0, Duration: 0.5
5 - Note: F4, Offset: 0.5, Duration: 1.0
6 - Note: C#4, Offset: 0.0, Duration: 0.5
7 - Note: B-4, Offset: 0.5, Duration: 0.5
8 - Note: B-4, Offset: 0.5, Duration: 2.0
9 - Note: G#4, Offset: 0.0, Duration: 0.5
10 - Chord: 3.8, Offset: 0.5, Duration: 0.5
11 - Chord: 3.8, Offset: 0.5, Duration: 1.0
12 - Note: C5, Offset: 0.0, Duration: 0.5
13 - Note: F4, Offset: 0.5, Duration: 0.5
14 - Note: E-5, Offset: 0.5, Duration: 4.0
15 - Note: G#4, Offset: 0.0, Duration: 2.75
16 - Note: G3, Offset: 0.0, Duration: 1.0
17 - Note: F4, Offset: 1.0, Duration: 0.5
18 - Chord: 8.0.3, Offset: 0.5, Duration: 2.0
19 - Note: G#4, Offset: 0.5, Duration: 0.5
20 - Chord: 4.6.11, Offset: 0.5, Duration: 0.5
21 - Chord: 0.5, Offset: 0.5, Duration: 0.5
22 - Chord: 5.10, Offset: 0.0, Duration: 1.0
23 - Chord: 0.5, Offset: 0.5, Duration: 0.5
24 - Note: 
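
The `try`/`except` blocks around `float(...)` in this notebook exist because some offsets and durations are stored as fraction strings. A possible alternative, sketched here under the assumption that every stored string is either a decimal (`0.5`) or a `numerator/denominator` pair (`1/3`), is the standard library's `fractions.Fraction`, which parses both forms. `parse_quarter_length` is a hypothetical helper name.

```python
from fractions import Fraction


def parse_quarter_length(text):
    """Convert a stored duration/offset string to a float.

    Fraction accepts both decimal strings ("2.75") and
    "numerator/denominator" strings ("1/3").
    """
    return float(Fraction(text))


values = [parse_quarter_length(text) for text in ["0.5", "2.75", "1/3"]]
print(values)
```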

### Create MIDI
Combine chords/notes, durations, and offsets into a MIDI file.

In [25]:
note_counter = 0
offset = 0
final_output = []

for pattern in prediction_output_notes:
    # duration
    try:
        new_duration = float(prediction_output_durations[note_counter])
    except ValueError:
        # Fractional durations are stored as strings such as '1/3'
        top, bottom = prediction_output_durations[note_counter].split('/')
        new_duration = float(top)/float(bottom)

    # if chord
    if pattern.isdigit() or ('.' in pattern): 
        notes_in_chord = pattern.split('.')
        notes = []
        for current_note in notes_in_chord:
            subnote = note.Note(int(current_note))
            subnote.storedInstrument = instrument.Piano()
            notes.append(subnote)
        new_chord = chord.Chord(notes)
        new_chord.duration.quarterLength = new_duration
        new_chord.offset = offset
        
        final_output.append(new_chord)
    #if note
    else:
        new_note = note.Note(pattern)
        new_note.offset = offset
        new_note.storedInstrument = instrument.Piano()
        new_note.duration.quarterLength = new_duration
        
        final_output.append(new_note)

    # during each iteration, increase the offset so notes do not stack on each other
    # prediction_output_offsets already holds floats (converted during generation)
    offset += float(prediction_output_offsets[note_counter])
            
    note_counter += 1

print(final_output)

[<music21.note.Note B->, <music21.chord.Chord E- F>, <music21.note.Note B->, <music21.note.Note F>, <music21.note.Note F>, <music21.note.Note C#>, <music21.note.Note B->, <music21.note.Note B->, <music21.note.Note G#>, <music21.chord.Chord E- G#>, <music21.chord.Chord E- G#>, <music21.note.Note C>, <music21.note.Note F>, <music21.note.Note E->, <music21.note.Note G#>, <music21.note.Note G>, <music21.note.Note F>, <music21.chord.Chord G# C E->, <music21.note.Note G#>, <music21.chord.Chord E F# B>, <music21.chord.Chord C F>, <music21.chord.Chord F B->, <music21.chord.Chord C F>, <music21.note.Note C>, <music21.note.Note A>, <music21.chord.Chord B- D F>, <music21.note.Note D>, <music21.note.Note G>, <music21.chord.Chord A>, <music21.note.Note C>, <music21.note.Note G>, <music21.note.Note C>, <music21.chord.Chord E G A C>, <music21.chord.Chord E G B>, <music21.chord.Chord A D>, <music21.note.Note F#>, <music21.chord.Chord B E>, <music21.chord.Chord B E>, <music21.chord.Chord D E G A>, <mus

In [26]:
midi_stream = stream.Stream(final_output)
midi_stream.write('midi', fp=file_midi_output)

print("MIDI created as " + file_midi_output)

MIDI created as output/output_midi.mid
