# Generating Markovian music pieces

This notebook includes concepts, functions and some examples of how to use Music21 to create a Markov model for music generation.

### music21 is a powerful Python package for music manipulation and analysis

The music21 package includes functions to import, manipulate, and export objects (music scores) in a simple way. 

To use all of its functionality, we need a music notation program such as [MuseScore](https://musescore.org/en) installed as a backend.


## Loading music from the catalog

The catalog is described here: https://web.mit.edu/music21/doc/about/referenceCorpus.html#referenceCorpus

We start by importing the package:


In [1]:
# Music21 is a python package for music analysis and manipulation

import music21 as m21 
from music21.note import Note

and we can load a musical piece and assign it to a variable:

In [2]:
bach_846 = m21.corpus.parse('bach/bwv846')

If we have MuseScore installed we can see the music score with ``object.show()`` 

In [None]:
#displaying the score
bach_846.show();

To load an external file (MIDI or MusicXML), we use the converter ``music21.converter.parse()``, which parses the file into an object that can be assigned to a variable in the same way:

In [7]:
# function to load the midi file (from path) and return the music21 score object
def load_midi(midi_path):
    # parse the file into a score object that music21 can work with
    score = m21.converter.parse(
        midi_path,                     # path to the MIDI file
        quantizePost=True,             # quantize note timing after parsing
        quarterLengthDivisors=(4, 3))  # snap to 1/4 and 1/3 of a quarter note
    return score
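
The ``quarterLengthDivisors=(4, 3)`` argument limits the quantization grid to quarters and thirds of a quarter note, that is, sixteenth notes and eighth-note triplets. The sketch below is a rough pure-Python illustration of that idea. It is not music21's actual implementation, and ``snap_to_grid`` is a hypothetical helper introduced only for this example.

```python
from fractions import Fraction

def snap_to_grid(value, divisors=(4, 3)):
    """Snap a quarter-length value to the nearest multiple of 1/d, for d in divisors.

    Hypothetical helper for illustration only; music21's own quantization
    may differ in details such as tie-breaking.
    """
    exact = Fraction(value)
    candidates = []
    for d in divisors:
        step = Fraction(1, d)
        # nearest multiple of 1/d to the input value
        candidates.append(round(exact / step) * step)
    # keep the candidate with the smallest error
    return min(candidates, key=lambda c: abs(c - exact))

print(snap_to_grid(0.26))  # 1/4
print(snap_to_grid(0.32))  # 1/3
```

A slightly late sixteenth note lands on 1/4, while a value close to a triplet lands on 1/3.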



We will use two MIDI examples, "joplin_entertainer.mid" and "palestrina_choral_o_bone_jesu.mid":

In [8]:
joplin_path = 'joplin_entertainer.mid'
pales_path = 'palestrina_choral_o_bone_jesu.mid'

Now we create the music21 objects:

In [9]:
joplin_m21 = load_midi(joplin_path)
pales_m21 = load_midi(pales_path)

If we print the objects we just created with ``show('text')``, we get better insight into their structure:

In [10]:
joplin_m21.show('text')

{0.0} <music21.stream.Part 0x74e2cafc3580>
    {0.0} <music21.stream.Measure 1 offset=0.0>
        {0.0} <music21.instrument.Instrument "Scott Joplin's Rag-Time performance: ">
        {0.0} <music21.clef.TrebleClef>
        {0.0} <music21.tempo.MetronomeMark animato Quarter=120.0>
        {0.0} <music21.key.Key of C major>
        {0.0} <music21.meter.TimeSignature 4/4>
        {0.0} <music21.note.Rest 3.25ql>
        {3.25} <music21.chord.Chord D6 D5>
        {3.5833} <music21.note.Rest 1/12ql>
        {3.6667} <music21.chord.Chord E6 E5>
    {4.0} <music21.stream.Measure 2 offset=4.0>
        {0.0} <music21.chord.Chord C5 C6>
        {0.3333} <music21.chord.Chord A5 A4>
        {1.0} <music21.chord.Chord B4 B5>
        {1.3333} <music21.chord.Chord G4 G5>
        {1.5833} <music21.note.Rest 5/12ql>
        {2.0} <music21.chord.Chord D5 D4>
        {2.3333} <music21.chord.Chord E5 E4>
        {2.6667} <music21.chord.Chord C4 C5>
        {3.0} <music21.chord.Chord A4 A3>
        {3.75

To extract the information needed to build a Markov model, I'm going to use a function posted [here](https://douglasduhaime.com/posts/making-chiptunes-with-markov-models.html) that extracts the waits and the notes (as MIDI pitch numbers) with their respective durations:

In [11]:
# function to convert the music21 score object to a string
def midi_to_string(score):
    s = ''
    # keep a record of the last time offset seen in the score
    last_offset = 0
    # iterate over each note in the score
    for n in score.flat.notes:
        # measure the time between this note and the previous
        delta = n.offset - last_offset
        # get the duration of this note
        duration = n.duration.components[0].type
        # store the time at which this note started
        last_offset = n.offset
        # if some time elapsed, add a "wait" token
        if delta: s += 'w_{} '.format(delta)
        # add tokens for each note (or each note in a chord)
        notes = [n] if isinstance(n, Note) else n.notes
        for i in notes:
            # add this keypress to the sequence
            s += 'n_{}_{} '.format(i.pitch.midi, duration)
    return s


Now, instead of having a MIDI file or a music score, we have a single line of text with all the information we need to build a Markov model.

In [12]:
joplin_txt = midi_to_string(joplin_m21)

In [13]:
print(joplin_txt)

w_3.25 n_86_eighth n_74_eighth w_0.4166666666666665 n_88_eighth n_76_eighth w_0.3333333333333335 n_72_eighth n_84_eighth w_0.33333333333333304 n_81_quarter n_69_quarter w_0.666666666666667 n_71_eighth n_83_eighth w_0.33333333333333304 n_67_16th n_79_16th w_0.666666666666667 n_74_eighth n_62_eighth w_0.33333333333333304 n_76_eighth n_64_eighth w_1/3 n_60_eighth n_72_eighth w_0.33333333333333304 n_69_eighth n_57_eighth w_0.75 n_71_16th n_59_16th w_0.25 n_67_16th n_55_16th n_71_32nd n_59_32nd w_0.75 n_50_eighth n_62_eighth w_0.25 n_64_16th n_52_16th w_0.5 n_60_eighth n_48_eighth w_0.25 n_57_eighth n_45_eighth w_0.75 n_59_eighth n_47_eighth w_0.25 n_45_eighth n_57_eighth w_0.25 n_44_16th n_56_16th w_0.5 n_43_eighth n_55_eighth w_1.25 n_79_16th n_67_16th n_71_16th n_74_16th n_31_16th n_43_16th w_0.75 n_55_quarter n_59_quarter n_62_eighth w_0.25 n_63_16th w_0.5 n_48_16th n_64_eighth w_0.25 n_72_quarter w_0.5 n_55_16th n_52_16th n_60_16th w_0.25 n_64_eighth w_0.4166666666666661 n_43_16th n_55
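
In the output above, the same musical wait appears under several spellings (``w_0.3333333333333335``, ``w_0.33333333333333304``, ``w_1/3``) because of floating-point rounding. A Markov model treats each spelling as a different token. A minimal sketch of one way to merge them follows. It assumes the quantization grid used earlier, so every wait is a multiple of 1/12 of a quarter note, and ``normalize_waits`` is a hypothetical helper, not part of the original source.

```python
from fractions import Fraction

def normalize_waits(text, max_denominator=12):
    """Rewrite every wait token as an exact fraction so equal waits share one token."""
    out = []
    for token in text.split():
        if token.startswith('w_'):
            # Fraction accepts both '0.25' and '1/3' style strings
            value = Fraction(token[2:]).limit_denominator(max_denominator)
            out.append(f'w_{value}')
        else:
            out.append(token)
    return ' '.join(out)

sample = ('w_0.3333333333333335 n_72_eighth '
          'w_0.33333333333333304 n_81_quarter w_1/3 n_60_eighth')
print(normalize_waits(sample))
# w_1/3 n_72_eighth w_1/3 n_81_quarter w_1/3 n_60_eighth
```

Waits written as fractions still parse with ``Fraction``, so the text remains readable by code that decodes waits that way.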

From the [same source](https://douglasduhaime.com/posts/making-chiptunes-with-markov-models.html), we can use the function that inverts this process and builds a music21 stream from the text representation:

In [14]:
from fractions import Fraction

def string_to_midi(s):
  # initialize the sequence into which we'll add notes
  stream = m21.stream.Stream()
  # keep track of the last observed time
  time = 1
  # iterate over each token in our string
  for i in s.split():
    # if the token starts with 'n' it's a note
    if i.startswith('n'):
      # identify the note and its duration
      note, duration = i.lstrip('n_').split('_')
      # create a new note object
      n = m21.note.Note(int(note))
      # specify the note's duration
      n.duration.type = duration
      # add the note to the stream
      stream.insert(time, n)
    # if the token starts with 'w' it's a wait
    elif i.startswith('w'):
      # add the wait duration to the current time
      time += float(Fraction(i.lstrip('w_')))
  # return the stream we created
  return stream
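
Note that ``str.lstrip('n_')`` removes any leading run of the characters ``n`` and ``_``, not the literal prefix ``n_``. It works here because the next character is always a digit, but a stricter parser can make the token structure explicit. A minimal sketch, where ``parse_token`` is a hypothetical helper that is not part of the original source:

```python
from fractions import Fraction

def parse_token(token):
    """Split a token into ('note', midi_pitch, duration_type) or ('wait', quarter_lengths)."""
    # the first underscore separates the token kind from its payload
    kind, _, rest = token.partition('_')
    if kind == 'n':
        pitch, duration = rest.split('_')
        return ('note', int(pitch), duration)
    if kind == 'w':
        # Fraction parses both decimal ('3.25') and ratio ('1/3') strings
        return ('wait', Fraction(rest))
    raise ValueError(f'unknown token: {token!r}')

print(parse_token('n_86_eighth'))  # ('note', 86, 'eighth')
print(parse_token('w_1/3'))        # ('wait', Fraction(1, 3))
print(parse_token('w_3.25'))       # ('wait', Fraction(13, 4))
```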

To build a Markov model of order "n", we collect all the sub-sequences of size "n" in the text version of the musical piece using the function ``ngrams()`` from the Natural Language Toolkit (NLTK):

In [15]:
from nltk import ngrams

ngrams_2order = list(ngrams(joplin_txt.split(),2))

In [16]:
ngrams_2order

[('w_3.25', 'n_86_eighth'),
 ('n_86_eighth', 'n_74_eighth'),
 ('n_74_eighth', 'w_0.4166666666666665'),
 ('w_0.4166666666666665', 'n_88_eighth'),
 ('n_88_eighth', 'n_76_eighth'),
 ('n_76_eighth', 'w_0.3333333333333335'),
 ('w_0.3333333333333335', 'n_72_eighth'),
 ('n_72_eighth', 'n_84_eighth'),
 ('n_84_eighth', 'w_0.33333333333333304'),
 ('w_0.33333333333333304', 'n_81_quarter'),
 ('n_81_quarter', 'n_69_quarter'),
 ('n_69_quarter', 'w_0.666666666666667'),
 ('w_0.666666666666667', 'n_71_eighth'),
 ('n_71_eighth', 'n_83_eighth'),
 ('n_83_eighth', 'w_0.33333333333333304'),
 ('w_0.33333333333333304', 'n_67_16th'),
 ('n_67_16th', 'n_79_16th'),
 ('n_79_16th', 'w_0.666666666666667'),
 ('w_0.666666666666667', 'n_74_eighth'),
 ('n_74_eighth', 'n_62_eighth'),
 ('n_62_eighth', 'w_0.33333333333333304'),
 ('w_0.33333333333333304', 'n_76_eighth'),
 ('n_76_eighth', 'n_64_eighth'),
 ('n_64_eighth', 'w_1/3'),
 ('w_1/3', 'n_60_eighth'),
 ('n_60_eighth', 'n_72_eighth'),
 ('n_72_eighth', 'w_0.3333333333333
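
To make the sliding window explicit, here is a pure-Python sketch that produces the same kind of tuples for a short token list. ``sliding_ngrams`` is a hypothetical name used only for illustration; the notebook itself keeps using ``nltk.ngrams``.

```python
def sliding_ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    # zip the list with copies of itself shifted by 1, 2, ..., n-1 positions
    return list(zip(*(tokens[i:] for i in range(n))))

tokens = 'w_3.25 n_86_eighth n_74_eighth w_0.4166666666666665'.split()
for gram in sliding_ngrams(tokens, 2):
    print(gram)
# ('w_3.25', 'n_86_eighth')
# ('n_86_eighth', 'n_74_eighth')
# ('n_74_eighth', 'w_0.4166666666666665')
```

A sequence of T tokens yields T - n + 1 n-grams, and consecutive n-grams overlap in n - 1 tokens.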

We can then use the function from the [same post](https://douglasduhaime.com/posts/making-chiptunes-with-markov-models.html) to generate a sequence of a desired length (``output_length``, counted in n-grams) with a Markov model of order ``ngram_size``.

In [17]:
from collections import defaultdict
import random

def markov(s, ngram_size=6, output_length=250, random_start=False):
  # create a dictionary to store the mapping of n-grams to possible following n-grams
  d = defaultdict(list)
  # make a list of tuples, each holding ngram_size consecutive tokens
  tokens = list(ngrams(s.split(), ngram_size))

  # store the map from each n-gram to the n-grams that follow it in the piece
  # the key is an n-gram and the value is the list of possible following n-grams
  for idx, i in enumerate(tokens[:-1]):
    d[i].append(tokens[idx+1])

  # choose a random starting n-gram
  if random_start:
    l = [random.choice(tokens)]
  else:  # start from the same n-gram as the original piece
    l = [tokens[0]]

  # keep sampling until the output holds output_length n-grams
  while len(l) < output_length:
    # choose the next n-gram at random from the followers of the last one;
    # if it has no followers, fall back to any n-gram in the piece
    l.append(random.choice(d.get(l[-1], tokens)))
  # format the result into a string
  return ' '.join([' '.join(i) for i in l])
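
In this function each state is an n-gram and its successor is the next overlapping n-gram, so joining every state in full repeats tokens: the states ``(a, b)`` and ``(b, c)`` are written out as ``a b b c``. If that repetition is not wanted, one possible variant keeps the first state in full and then appends only the last token of each following state. The sketch below is pure Python, and ``markov_no_overlap`` is a hypothetical name, not part of the original source.

```python
import random
from collections import defaultdict

def markov_no_overlap(s, ngram_size=2, output_length=4):
    """Sample output_length states, emitting only the last token of each new state."""
    words = s.split()
    # all runs of ngram_size consecutive tokens
    states = [tuple(words[i:i + ngram_size])
              for i in range(len(words) - ngram_size + 1)]
    # map each state to the states that followed it in the source
    followers = defaultdict(list)
    for current, nxt in zip(states, states[1:]):
        followers[current].append(nxt)
    chain = [states[0]]
    while len(chain) < output_length:
        # fall back to any state when the current one has no recorded follower
        chain.append(random.choice(followers.get(chain[-1], states)))
    # first state in full, then one new token per state
    return ' '.join(list(chain[0]) + [state[-1] for state in chain[1:]])

# every state in this toy sequence has exactly one follower,
# so the chain reproduces the input
print(markov_no_overlap('a b c d e', ngram_size=2, output_length=4))
# a b c d e
```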


Now we can generate our Markovian sequence and save it to a MIDI file in two simple steps:

In [18]:
#defining variables
gram_size = 6
out_len = 200
joplin_mv_v1 = markov(joplin_txt, ngram_size=gram_size, output_length=out_len)


And to save it into a MIDI file:

In [19]:
string_to_midi(joplin_mv_v1).write('midi', f'joplin_markov_v1_{gram_size}gram_{out_len}.mid')

'joplin_markov_v1_6gram_200.mid'
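
The order of the model controls how far the output can drift from the source. With long contexts, many n-grams occur only once, so they have a single possible successor and the chain tends to copy the original. One way to check this is sketched below with a hypothetical helper, ``branching_ratio``, which measures the share of contexts that were followed by more than one distinct token. It is pure Python and shown on a toy sequence rather than on the piece itself.

```python
from collections import defaultdict

def branching_ratio(text, ngram_size):
    """Share of n-gram contexts that were followed by more than one distinct token."""
    words = text.split()
    followers = defaultdict(set)
    for i in range(len(words) - ngram_size):
        context = tuple(words[i:i + ngram_size])
        followers[context].add(words[i + ngram_size])
    if not followers:
        return 0.0
    branching = sum(1 for nxt in followers.values() if len(nxt) > 1)
    return branching / len(followers)

toy = 'a b a c a b a d'
print(branching_ratio(toy, 1))  # one of three contexts ('a') has several successors
print(branching_ratio(toy, 4))  # 0.0, every 4-token context has a single successor
```

Calling it on ``joplin_txt`` for several values of ``ngram_size`` would show how quickly the choices shrink for this piece.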

## My issue with this representation:

The issue I have with this representation is that chords (notes that are played together) are encoded as separate consecutive tokens, as if their notes did not happen at the same time.

For example:

If a music piece has the note sequence: 

    **do_mi - re - mi**
    
Here do_mi is a chord. If I want a memory of three steps in time, the 3-gram should be (do_mi, re, mi), with do_mi counted as a single token because its notes are played at the same time. In the representation above, each note of the chord is a separate token, so (ignoring wait tokens) the first 3-gram is (do, mi, re) and spans only two steps in time. This is one of the main differences from text: text is univariate, while music has multiple variables.
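
The difference can be seen in a few lines of plain Python. This is only an illustration of the example above: wait tokens and durations are left out for brevity, and the variable names are hypothetical.

```python
# each inner list is one time step; notes in the same list sound together
events = [['do', 'mi'], ['re'], ['mi']]

# note-by-note representation: every note is its own token
flat_tokens = [note for event in events for note in event]
# chord-aware representation: one token per time step
chord_tokens = ['_'.join(event) for event in events]

def first_ngram(tokens, n=3):
    """Return the first n tokens as a tuple."""
    return tuple(tokens[:n])

print(first_ngram(flat_tokens))   # ('do', 'mi', 're')
print(first_ngram(chord_tokens))  # ('do_mi', 're', 'mi')
```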

To build a model with the same approach that makes more **musical sense**, we can encode each chord (notes that are played together) as a single token and rewrite the functions defined previously:

In [20]:
def midi_to_stringv2(score):
    # s will store the sequence of notes in string form
    s = ''
    # keep a record of the last time offset seen in the score
    last_offset = 0
    # iterate over each note in the score
    for n in score.flat.notes:
        # measure the time between this note and the previous
        delta = n.offset - last_offset
        # get the duration of this note
        duration = n.duration.components[0].type
        # store the time at which this note started
        last_offset = n.offset
        # if some time elapsed, add a "wait" token
        if delta: s += 'w_{} '.format(delta)
        # add tokens for each note (or each note in a chord)
        notes = [n] if isinstance(n, Note) else n.notes
        note_tokens = []
        for i in notes:
            # collect the token for this note; all notes in the chord are joined below
            note_tokens.append('n_{}_{}'.format(i.pitch.midi, duration))
        # join all note tokens for this sequence and add to s
        s += '_'.join(note_tokens) + ' '
    return s

def string_to_midi2(s):
    # assumes music21 (as m21) and Fraction were imported in earlier cells
    # create a new stream to hold the notes
    midi_stream = m21.stream.Stream()
    
    time = 1
    # Split the input string into tokens
    tokens = s.split()
    
    for i in tokens:
        if i.startswith('n'):
            # Remove the leading 'n_' and split the rest
            note_info = i.replace("n_", "").split("_")
            # Process each note and duration pair
            for j in range(0, len(note_info), 2):
                note, duration = note_info[j], note_info[j+1]
                # Create a new note object
                n = m21.note.Note(int(note))
                # Assuming 'duration' is a valid duration type for music21
                n.duration.type = duration
                # Add the note to the MIDI stream
                midi_stream.insert(time, n)
        elif i.startswith('w'):
            # add the wait duration to the current time
            time += float(Fraction(i.lstrip('w_')))

    # return the stream we created
    return midi_stream
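
The chord token format can be checked without music21 by encoding a chord and decoding it again. The sketch below mirrors the string handling of the two functions above; ``encode_chord`` and ``decode_chord`` are hypothetical helpers introduced only for this check.

```python
def encode_chord(notes):
    """notes: list of (midi_pitch, duration_type) pairs that are played together."""
    return '_'.join(f'n_{pitch}_{duration}' for pitch, duration in notes)

def decode_chord(token):
    """Inverse of encode_chord."""
    fields = token.replace('n_', '').split('_')
    # fields alternate: pitch, duration, pitch, duration, ...
    return [(int(fields[j]), fields[j + 1]) for j in range(0, len(fields), 2)]

chord = [(86, 'eighth'), (74, 'eighth')]
token = encode_chord(chord)
print(token)                # n_86_eighth_n_74_eighth
print(decode_chord(token))  # [(86, 'eighth'), (74, 'eighth')]
```

A single note is just a chord of size one, so the same decoder handles tokens such as ``n_63_16th``.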

Generating the new text version of the music piece:

In [21]:

joplin_txtv2 = midi_to_stringv2(joplin_m21)

We can then inspect the n-grams to verify that they follow the new representation:

In [22]:
ngrams_2orderv2 = list(ngrams(joplin_txtv2.split(),2))

In [23]:
ngrams_2orderv2

[('w_3.25', 'n_86_eighth_n_74_eighth'),
 ('n_86_eighth_n_74_eighth', 'w_0.4166666666666665'),
 ('w_0.4166666666666665', 'n_88_eighth_n_76_eighth'),
 ('n_88_eighth_n_76_eighth', 'w_0.3333333333333335'),
 ('w_0.3333333333333335', 'n_72_eighth_n_84_eighth'),
 ('n_72_eighth_n_84_eighth', 'w_0.33333333333333304'),
 ('w_0.33333333333333304', 'n_81_quarter_n_69_quarter'),
 ('n_81_quarter_n_69_quarter', 'w_0.666666666666667'),
 ('w_0.666666666666667', 'n_71_eighth_n_83_eighth'),
 ('n_71_eighth_n_83_eighth', 'w_0.33333333333333304'),
 ('w_0.33333333333333304', 'n_67_16th_n_79_16th'),
 ('n_67_16th_n_79_16th', 'w_0.666666666666667'),
 ('w_0.666666666666667', 'n_74_eighth_n_62_eighth'),
 ('n_74_eighth_n_62_eighth', 'w_0.33333333333333304'),
 ('w_0.33333333333333304', 'n_76_eighth_n_64_eighth'),
 ('n_76_eighth_n_64_eighth', 'w_1/3'),
 ('w_1/3', 'n_60_eighth_n_72_eighth'),
 ('n_60_eighth_n_72_eighth', 'w_0.33333333333333304'),
 ('w_0.33333333333333304', 'n_69_eighth_n_57_eighth'),
 ('n_69_eighth_n_5

Finally, we generate a MIDI file with a Markov model of the second representation:

In [24]:
joplin_mv_v2 = markov(joplin_txtv2, ngram_size=gram_size, output_length=out_len)
string_to_midi2(joplin_mv_v2).write('midi', f'joplin_markov_v2_{gram_size}gram_{out_len}.mid')

'joplin_markov_v2_6gram_200.mid'