In [None]:
import music21

In [None]:
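import numpy as np
from music21 import converter, instrument, note, chord

def read_midi(file):
  """Return the piano notes and chords of a MIDI file as an array of strings."""
  notes = []
  notes_to_parse = None

  # Parse the MIDI file and split it into one part per instrument.
  midi = converter.parse(file)
  s2 = instrument.partitionByInstrument(midi)

  for part in s2.parts:
    # Only the piano parts are used.
    if "Piano" in str(part):
      notes_to_parse = part.recurse()

      for element in notes_to_parse:
        if isinstance(element, note.Note):
          # A single note is stored by pitch name, e.g. "E4".
          notes.append(str(element.pitch))
        elif isinstance(element, chord.Chord):
          # A chord is stored as one token of dot-joined pitch classes, e.g. "0.4.7".
          notes.append(".".join(str(n) for n in element.normalOrder))

  return np.array(notes)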

Defines the function that reads the notes and chords from a MIDI music file. It creates some empty “container” variables and parses the MIDI file, then iterates through the different parts and acts only on the piano music. For each element in a piano part, a note is added to the list by its pitch name (“E4” for example). A chord is not split into separate notes: the pitch classes of its normal order are joined with dots into a single string (“0.4.7” for example), which is added to the list as one entry. A numpy array of the notes list is returned at the end.
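
To make the chord handling concrete, here is a minimal sketch of the token format that does not need music21. The helper name `chord_token` is hypothetical, and the list `[0, 4, 7]` is an assumed stand-in for what `element.normalOrder` would hold for a C major triad.

```python
def chord_token(normal_order):
    """Join pitch classes into one dot-separated token, as read_midi does."""
    return ".".join(str(n) for n in normal_order)

# Assumed stand-in for element.normalOrder of a C major triad (C, E, G).
c_major = [0, 4, 7]
token = chord_token(c_major)
print(token)

# A chord therefore adds exactly one entry to the notes list.
notes = ["E4"]
notes.append(token)
print(notes)
```

Because each chord becomes a single string, the rest of the notebook can treat notes and chords as the same kind of token.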

In [None]:
import os
import numpy as np

path = "/content/"

files = [i for i in os.listdir(path) if i.endswith(".mid")]

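# Each file yields a different number of notes, so store the results as an object array.
notes_array = np.array([read_midi(path + i) for i in files], dtype=object)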

Creates a list of all appropriate files in the path (MIDI music files). It iterates over the list of files, applying our MIDI reading function to each file and wrapping the results in a numpy array. 

In [None]:
notes_ = [element for note_ in notes_array for element in note_]

unique_notes = list(set(notes_))
print(len(unique_notes))

In [None]:
from collections import Counter
freq = dict(Counter(notes_))

import matplotlib.pyplot as plt

no = [count for _, count in freq.items()]

plt.figure(figsize=(5,5))

plt.hist(no)

In [None]:
frequent_notes = [note_ for note_, count in freq.items() if count >= 50]
print(len(frequent_notes))

Helps us understand the data we have processed. The third cell creates a list of all notes found in the dataset, builds a set of unique notes, and prints how many there are. Note that the term “unique notes” includes chords in this example: the program found 316 unique “notes”, while a full-size piano has only 88 keys. The fourth cell counts how often each note/chord occurs in the dataset and stores the counts in a dictionary. It then collects the counts into a list and plots them in a histogram. The histogram shows that a small portion of the notes are repeated with a high frequency. The fifth cell focuses on these frequent notes by keeping only the notes/chords that occur at least 50 times in the music.
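
The counting and filtering steps can be tried on a small scale. This is a sketch on made-up data: the names `toy_notes`, `toy_freq`, `toy_frequent`, and `min_count` are hypothetical, and the threshold is lowered from 50 to 2 so the toy list has something to keep.

```python
from collections import Counter

# Toy data (hypothetical): a flattened list of note/chord tokens.
toy_notes = ["C4", "E4", "C4", "0.4.7", "C4", "E4", "G4"]
toy_freq = Counter(toy_notes)

# Keep tokens that occur at least `min_count` times (the notebook uses 50).
min_count = 2
toy_frequent = [tok for tok, count in toy_freq.items() if count >= min_count]

print(toy_freq.most_common(2))
print(sorted(toy_frequent))
```

Tokens below the threshold (here “0.4.7” and “G4”) are dropped from the vocabulary the model will learn.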

In [None]:
new_music = []

for notes in notes_array:
  temp = []
  for note_ in notes:
    if note_ in frequent_notes:
      temp.append(note_)
  new_music.append(temp)

# The filtered pieces have different lengths, so store them as an object array.
new_music = np.array(new_music, dtype=object)

Creates a new, smaller dataset that is used for the rest of the program. Each piece keeps only the most frequent notes/chords from the original dataset, in their original order.

In [None]:
timesteps = 32
x = []
y = []

for note_ in new_music:
  for i in range(0, len(note_) - timesteps, 1):

    input_ = note_[i : i + timesteps]
    output_ = note_[i + timesteps]

    x.append(input_)
    y.append(output_)

x = np.array(x)
y = np.array(y)    

Treats the music as a time series, defining the window as 32 notes and the answer as the 33rd. It iterates over the new dataset using this window and saves these slices to x (the 32 notes) and y (the 33rd note) numpy arrays. 
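
The windowing is easier to see on a short sequence. The sketch below uses a hypothetical helper, `make_windows`, and a window of 3 instead of 32. It follows the same slicing as the cell above but is not the notebook's code.

```python
def make_windows(sequence, timesteps):
    """Return (inputs, targets): each input holds `timesteps` items, each target is the item that follows."""
    inputs, targets = [], []
    for i in range(len(sequence) - timesteps):
        inputs.append(sequence[i : i + timesteps])
        targets.append(sequence[i + timesteps])
    return inputs, targets

toy_song = ["A", "B", "C", "D", "E"]  # hypothetical five-note piece
windows, next_notes = make_windows(toy_song, timesteps=3)

print(windows)
print(next_notes)
```

A piece with `timesteps` notes or fewer produces no windows at all, so very short pieces contribute nothing to x and y.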

In [None]:
unique_x = list(set(x.ravel()))
x_note_to_int = dict((note_, number) for number, note_ in enumerate(unique_x))

Creates a set from the flattened x array and uses it to assign a unique integer to each note found in x. The integers come from the enumerate call on the second line of the cell.
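
One thing to keep in mind: the iteration order of a set of strings can change between Python sessions, so the numbering produced from `list(set(...))` is not guaranteed to be the same on every run. A possible variation, sketched here on made-up data with hypothetical names (`vocab`, `note_to_int`, `int_to_note`), sorts the unique notes first and also builds the reverse lookup for turning numbers back into notes.

```python
toy_tokens = ["E4", "C4", "0.4.7", "C4", "E4"]  # hypothetical data

# Sorting the unique tokens fixes the numbering regardless of set iteration order.
vocab = sorted(set(toy_tokens))
note_to_int = {tok: idx for idx, tok in enumerate(vocab)}
int_to_note = {idx: tok for tok, idx in note_to_int.items()}

encoded = [note_to_int[tok] for tok in toy_tokens]
decoded = [int_to_note[idx] for idx in encoded]

print(vocab)
print(encoded)
print(decoded)
```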

In [None]:
x_seq = []

for i in x:
  temp = []
  for j in i:
    temp.append(x_note_to_int[j])
  x_seq.append(temp)

x_seq = np.array(x_seq)   

Uses the dictionary generated in the previous cell to create a new array that contains the same information as the previous x array, but this time the notes are represented by unique numbers instead of strings (“E4” for example). We have vectorized our data, so we now have lists of 32 numbers that will be used to predict the 33rd number that should follow. 

In [None]:
unique_y = list(set(y))
y_note_to_int = dict((note_, number) for number, note_ in enumerate(unique_y))
y_seq = np.array([y_note_to_int[i] for i in y])

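Performs the vectorization of the y values, matching the process we used to vectorize the x values. Note that y gets its own dictionary, so the same note can be assigned a different number in y than in x.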

In [None]:
from sklearn.model_selection import train_test_split
x_tr, x_val, y_tr, y_val = train_test_split(x_seq, y_seq, test_size = 0.2, random_state=42)

In [None]:
x_tr = np.array(x_tr)
x_val = np.array(x_val)

x_tr = np.expand_dims(x_tr, 1)
x_val = np.expand_dims(x_val, 1)

In [None]:
x_tr = x_tr.astype("float32")
x_val = x_val.astype("float32")
y_tr = y_tr.astype("float32")
y_val = y_val.astype("float32")

Splits our data into train and validation sets, holding out 20% of the samples for validation. The following cell recasts our x arrays as numpy arrays and adds a dimension at axis 1, so each sample has the three-dimensional shape the LSTM in our model expects: (samples, 1, 32), which is one timestep of 32 features. The final cell turns every value into a float (the model was picky about this).
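
The effect of the reshaping is easiest to check by printing shapes. This sketch uses a made-up batch (`toy_x` is hypothetical). It also shows an alternative layout, which is not what the notebook does, where each of the 32 notes is its own timestep.

```python
import numpy as np

# Toy batch (hypothetical): 4 windows of 32 integer-encoded notes each.
toy_x = np.zeros((4, 32), dtype="int64")

# What the notebook does: one timestep carrying 32 features.
as_one_step = np.expand_dims(toy_x, 1).astype("float32")
print(as_one_step.shape, as_one_step.dtype)

# Alternative layout (an assumption, not the notebook's choice):
# 32 timesteps carrying one feature each.
as_sequence = np.expand_dims(toy_x, -1).astype("float32")
print(as_sequence.shape, as_sequence.dtype)
```

Keras LSTM layers read their input as (batch, timesteps, features), so the two layouts give the network a different view of the same 32 numbers.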

In [None]:
x_tr.dtype

In [None]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

num_notes = len(frequent_notes)

def lstm():
  model = Sequential()
  model.add(LSTM(128, return_sequences=True))
  model.add(LSTM(128))
  model.add(Dense(256, activation="relu"))
  model.add(Dense(num_notes, activation="softmax"))
  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
  return model

Defines our basic LSTM model. It creates a variable equal to the number of possible outcomes for our model. It is a sequential model, with two LSTM layers stacked on top of each other. These are followed by a Dense layer, and then a final Dense layer of size “possible outcomes”. Our final Dense layer has activation “softmax”, so it will assign each possible outcome a probability between 0 and 1 that the next note/chord in the series will be that one. 
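
To show what the softmax output means, here is a plain-Python illustration. It is not Keras's implementation, and the function `softmax` and the values in `toy_logits` are made up for the example.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)
    # Subtracting the maximum keeps the exponentials numerically stable.
    exps = [math.exp(v - highest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.1]  # hypothetical raw scores for three notes
probs = softmax(toy_logits)

# The most likely next note is the index with the highest probability.
predicted_class = probs.index(max(probs))

print(probs)
print(predicted_class)
```

In the notebook the same idea applies across all `num_notes` outputs: the index with the highest probability is the model's best guess for the next note/chord.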

In [None]:
from tensorflow.keras.callbacks import ModelCheckpoint
mc = ModelCheckpoint("best_model.h5", monitor="val_loss", mode="min", save_best_only=True, verbose=1)

In [None]:
# Keep a reference to the model so it can be used after training.
model = lstm()
history = model.fit(x_tr, y_tr, batch_size=128, epochs=50, validation_data=(x_val, y_val), verbose=1, callbacks=[mc])

Defines a callback that saves the current version of the model during training only when its validation loss is lower than that of the previous best model. The following cell trains our model on the data we prepared.
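
The save-only-the-best rule can be sketched without Keras. The helper `best_epochs` and the loss values are hypothetical. The function only mimics the decision of when a checkpoint would be written and does not save anything.

```python
def best_epochs(val_losses):
    """Return the 1-based epochs at which a save-best-only checkpoint would write."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # mode="min": a lower validation loss is better
            best = loss
            saved.append(epoch)
    return saved

toy_val_losses = [3.2, 2.9, 3.0, 2.7, 2.8]  # hypothetical validation losses
saved_epochs = best_epochs(toy_val_losses)
print(saved_epochs)
```

Epochs where the validation loss gets worse (3 and 5 in this toy run) leave the saved file untouched, so the file on disk always holds the best model seen so far.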