<a href="https://colab.research.google.com/github/bmill42/musical-structure/blob/main/Musical_Markov_models.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Setup

In [None]:
!pip install music21 --quiet
!pip install midi_player --quiet

In [None]:
import random
import music21
from midi_player import MIDIPlayer
from midi_player.stylers import basic, cifka_advanced
from fractions import Fraction

We'll work with a range of pitches covering three octaves, from one octave below middle C (MIDI note 60) to two octaves above. I'll explain the full notation for this later, but this code helps map our representation to MIDI numbers.

In [None]:
octave_map = { '-': 48, '=': 60, '+': 72 }

scale_deg_map = { '1': 0, '2': 2, '3': 4, '4': 5, '5': 7, '6': 9, '7': 11 }

def scale_degree_to_num(deg):
    return octave_map[deg[0]] + scale_deg_map[deg[1]]
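
As a quick sanity check of the mapping, here is a small sketch that converts a few pitches to MIDI numbers. It repeats the two lookup tables so that it runs on its own; the sample pitches are just illustrative choices.

```python
# Copies of the two lookup tables so this example runs on its own.
octave_map = {'-': 48, '=': 60, '+': 72}
scale_deg_map = {'1': 0, '2': 2, '3': 4, '4': 5, '5': 7, '6': 9, '7': 11}

def scale_degree_to_num(deg):
    # deg is a two-character string: octave marker first, then scale degree
    return octave_map[deg[0]] + scale_deg_map[deg[1]]

for deg in ['-1', '=1', '=5', '+7']:
    print(deg, '->', scale_degree_to_num(deg))
# -1 -> 48, =1 -> 60 (middle C), =5 -> 67, +7 -> 83
```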

In [None]:
def tune_to_string(tune, dur=1, wait=1, rhythms=None):
    pitches = tune.split(' ')
    if rhythms is None:
        # Write the wait value as-is so that non-integer waits (e.g. 0.5) still parse later
        return ' '.join(['n_{}_{} w_{}'.format(scale_degree_to_num(p), dur, wait) for p in pitches])
    else:
        pitch_rhythms = zip(pitches, rhythms)
        return ' '.join(['n_{}_{} w_{}'.format(str(scale_degree_to_num(pr[0])), pr[1], pr[1]) for pr in pitch_rhythms])

def string_to_midi(tune, dur=1, wait=1, rhythms=None):
    s = tune_to_string(tune, dur, wait, rhythms)
    stream = music21.stream.Stream()
    time = 1
    for i in s.split():
        if i.startswith('n'):
            note, duration = i.lstrip('n_').split('_')
            n = music21.note.Note(int(note))
            n.duration.quarterLength = float(duration)
            stream.insert(time, n)
        elif i.startswith('w'):
            time += float(Fraction(i.lstrip('w_')))
    return stream

def play_midi(tune, dur=1, wait=1, rhythms=None):
    midi = string_to_midi(tune, dur, wait, rhythms)
    midi.write('midi', 'generated.midi')
    return MIDIPlayer('generated.midi', 120, styler=cifka_advanced, title='My Player', width='50%')
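
The helpers above pass notes around as a string of `n_<midi>_<duration>` and `w_<wait>` tokens. To make that intermediate format concrete, here is a minimal sketch that reads such a string into `(offset, MIDI number, duration)` triples without needing music21. `parse_tokens` is a hypothetical helper written for this illustration, not part of the notebook, and the token string is a hand-written example.

```python
from fractions import Fraction

def parse_tokens(token_string, start=1.0):
    """Turn 'n_<midi>_<dur> w_<wait>' tokens into (offset, midi, duration) triples."""
    events = []
    time = start  # string_to_midi() also starts its clock at offset 1
    for tok in token_string.split():
        kind, _, rest = tok.partition('_')
        if kind == 'n':
            midi, dur = rest.split('_')
            events.append((time, int(midi), float(dur)))
        elif kind == 'w':
            # a wait token moves the clock forward before the next note
            time += float(Fraction(rest))
    return events

events = parse_tokens('n_60_1 w_1 n_64_0.5 w_0.5 n_67_2 w_2')
print(events)  # [(1.0, 60, 1.0), (2.0, 64, 0.5), (2.5, 67, 2.0)]
```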

# Markov model with random weights

We'll start by setting up a Markov model based on diatonic scale degrees with random transition probabilities, and refine it later.

First, let's set up a "keyboard" notation that captures all seven diatonic scale degrees with an octave indicator: `-` for a low octave, `=` for the middle octave, and `+` for a high octave. We can think of each item in the `keyboard` list as a white key on the piano.

In [None]:
diatonic_scale_degrees = [str(i) for i in range(1,8)]
octaves = ['-', '=', '+']

keyboard = [o + p for o in octaves for p in diatonic_scale_degrees]
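
To see what the comprehension produces, this short sketch rebuilds the list and prints a few entries. Note that neighboring entries are always one diatonic step apart, including across the octave boundary.

```python
diatonic_scale_degrees = [str(i) for i in range(1, 8)]
octaves = ['-', '=', '+']
keyboard = [o + p for o in octaves for p in diatonic_scale_degrees]

print(len(keyboard))    # 21 white keys: 3 octaves x 7 scale degrees
print(keyboard[6:9])    # ['-7', '=1', '=2'] -- the low octave runs straight into the middle one
print(keyboard[-1])     # '+7'
```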

## "Training" the model

Previously we built a model by keeping a list of all the words that followed any given word in the Shakespeare corpus, which let us simply choose a word at random to generate new sentences. This time we will represent the model as an explicit set of probabilities.

Here's the basic structure: the keys in the top-level dictionary represent the current state of the model, and the keys in the sub-dictionary represent the different possible following states.

```
{
    '1': {
        '1': 2,
        '2': 5,
        '3': 3,
    },
    '2': {
        '1': 5,
        '2': 1,
        '3': 4
    },
    ...
}
```

The values in the sub-dictionary are **weights**. They work like probabilities: larger values represent a higher chance of that state being chosen next. But they don't need to add up to one or even be decimal values, because we'll use a method that automatically adds up the weights and converts each one into its share of the total.

For example, if the current state was '1', the chance of moving to '2' would be `5 / (2 + 5 + 3) = 0.5`
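
The same arithmetic can be written out for every following state at once. This is only a sketch of the normalization step, using the weights for state `'1'` from the example dictionary above.

```python
weights = {'1': 2, '2': 5, '3': 3}   # weights for current state '1'

total = sum(weights.values())
probs = {state: w / total for state, w in weights.items()}

print(probs)  # {'1': 0.2, '2': 0.5, '3': 0.3}
```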

To generate a fully randomized model, we just need to add every note from our keyboard to the dictionary, and for each note we build a sub-dictionary that also contains every note, each with a random integer weight between 0 and 10 (inclusive).

We'll also follow the convention from the text model where `'START'` and `'END'` states control how generated sequences begin and end. Every single state can potentially lead to an `'END'` token, but we'll constrain `'START'` somewhat: only scale degrees 3 and 5 can begin the melody.

In [None]:
model = dict()

for k in keyboard:
    model[k] = dict()
    for j in keyboard:
        model[k][j] = random.randint(0, 10)
    model[k]['END'] = 10

model['START'] = {
    '=3': 1,
    '=5': 1
}
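
Because `'END'` always gets a weight of 10, it's worth checking how likely a tune is to stop at any given note. The sketch below rebuilds a random model the same way and computes that chance for `'=1'`; the exact number changes on every run. Since the other 21 weights average about 5, a typical state ends the tune a bit under 10% of the time.

```python
import random

keyboard = [o + p for o in ['-', '=', '+'] for p in '1234567']

# Same construction as above: random weights plus a fixed weight for 'END'
model = {}
for k in keyboard:
    model[k] = {j: random.randint(0, 10) for j in keyboard}
    model[k]['END'] = 10

weights = model['=1']
p_end = weights['END'] / sum(weights.values())
print(round(p_end, 3))  # varies from run to run
```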

Let's peek inside and see what the weights look like for scale degree 1 in the middle octave:

In [None]:
model['=1']

## Generating new tokens

We need a new method to generate tokens, since we can't just choose a random token from a list built from the corpus anymore. Instead, we need to sum all the weights and use those to choose the next state.

The `random.choices()` function does exactly this by taking a list of the possible states and a list of the weights. See the docs if you're interested in how it works.
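
One way to convince yourself that `random.choices()` respects the weights is to draw many samples and compare the observed frequencies with the expected shares. This sketch reuses the example weights for state `'1'` from earlier; the seed is only there to make the run repeatable.

```python
import random
from collections import Counter

weights = {'1': 2, '2': 5, '3': 3}

rng = random.Random(0)  # seeded only so the run is repeatable
draws = rng.choices(population=list(weights), weights=list(weights.values()), k=10_000)

# Observed frequencies should land close to 0.2, 0.5 and 0.3
freqs = {state: n / len(draws) for state, n in Counter(draws).items()}
print(freqs)
```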

Our `generate()` function is almost identical to the one from the text model.

In [None]:
def next_state(model, cur_state):
    return random.choices(population=list(model[cur_state].keys()), weights=list(model[cur_state].values()), k=1)[0]

def generate(model):
    output = ''
    pitch = next_state(model, 'START')

    while pitch != 'END':
        output += pitch + ' '
        pitch = next_state(model, pitch)

    return output.strip()
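
To trace the loop by hand, it can help to run it on a toy model where every state has exactly one possible successor, so the output is always the same. The sketch below restates both functions so it runs on its own (this `generate()` collects notes in a list rather than a string, which is just a stylistic variant), and `toy_model` is made up for the illustration.

```python
import random

def next_state(model, cur_state):
    options = model[cur_state]
    return random.choices(list(options.keys()), weights=list(options.values()), k=1)[0]

def generate(model):
    notes = []
    pitch = next_state(model, 'START')
    while pitch != 'END':
        notes.append(pitch)
        pitch = next_state(model, pitch)
    return ' '.join(notes)

# Made-up model: each state can only move to one other state
toy_model = {
    'START': {'=3': 1},
    '=3': {'=2': 1},
    '=2': {'=1': 1},
    '=1': {'END': 1},
}

print(generate(toy_model))  # =3 =2 =1
```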

We can test `next_state()` by providing it with the model and any valid state.

In [None]:
next_state(model, '=1')

Running `generate()` will give us a full tune beginning with `'START'` and finishing when it hits an `'END'` state.

In [None]:
new_tune = generate(model)
new_tune

The result may or may not sound slightly more tuneful than the 12-tone rows we generated previously---still pretty random, with a lot of awkward leaps.

In [None]:
play_midi(new_tune)

# Exercise: Markov model with custom weights

Your assignment is to build your own custom model by entering your own weights for the various scale degrees.

**Fill out this dictionary with new weights derived from the example transcriptions of Mitski and Taylor Swift.** Specifically, build one **verse model** and one **chorus model**, using the combined scale degree transition probabilities from the two songs we used as examples in class.

You can either calculate the probabilities by hand by counting the transitions for each scale degree or you can represent the melodies in a form that allows you to calculate the probabilities automatically (the code from the Shakespeare Markov model could be helpful here).

You must have `'START'` and `'END'` tokens, and you should use them strategically: `'START'` should lead to reasonable starting notes and melodies should typically `'END'` after reaching scale degree 1.
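
If you go the automatic route, one possible approach is sketched below: write each melody as a space-separated string in our keyboard notation, then count how often each state follows each other state. `count_transitions` is a hypothetical helper, and the two melodies are invented placeholders, not the class transcriptions. The raw counts can be used directly as weights.

```python
from collections import defaultdict

def count_transitions(melodies):
    """Count state-to-state transitions, wrapping each melody in START/END."""
    counts = defaultdict(lambda: defaultdict(int))
    for melody in melodies:
        states = ['START'] + melody.split() + ['END']
        for cur, nxt in zip(states, states[1:]):
            counts[cur][nxt] += 1
    # convert to plain dicts so the result looks like our other models
    return {cur: dict(nxts) for cur, nxts in counts.items()}

# Invented placeholder melodies (NOT the transcriptions from class)
toy_melodies = ['=3 =2 =1', '=5 =3 =2 =1']
toy_counts = count_transitions(toy_melodies)

print(toy_counts['START'])  # {'=3': 1, '=5': 1}
print(toy_counts['=1'])     # {'END': 2}
```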

In [None]:
verse_model = {
    # create the model here
}
chorus_model = {
    # create the model here
}

**Once your model dictionary is finished,** uncomment and run the following cells to view and hear the melodies you generate.

In [None]:
#custom_verse = generate(verse_model)

#print(custom_verse)
#play_midi(custom_verse)

In [None]:
#custom_chorus = generate(chorus_model)

#print(custom_chorus)
#play_midi(custom_chorus)

# Exercise: Markov model with pitch and rhythm

Our melodies will be more convincing if we give them more varied rhythms. Rhythm can be incorporated directly into the initial model by associating it with the tokens from the beginning (or by training on a corpus that includes duration information, like ABC notation), but we'll generate durations for our notes separately and apply them at playback.

We'll represent durations as numbers, with 1 representing the duration of the notes we've seen so far in the MIDI player. A duration of 0.5 is half as long, 2 is twice as long, etc.

We can generate rhythms entirely randomly by using `random.choice()` once for each element in the tune we already generated. The `play_midi()` function already knows how to add rhythms to tunes, so we can just add the list of durations as an argument.

In [None]:
durations = [0.5, 1, 2, 3, 4]
tune_rand_rhythm = [random.choice(durations) for i in range(len(new_tune.split(' ')))]

In [None]:
play_midi(new_tune, rhythms=tune_rand_rhythm)

We'll get better results if we design a separate Markov model for rhythm that accounts for how note durations tend to work in real music.

The `rhythm_model` below is set up just like the pitch model, and I've made some semi-reasonable choices in designing it (e.g. the shortest duration, 0.5, is most likely to be followed by another 0.5, so that together they fill the space of a single 1.0 duration). It isn't required, but *feel free to modify this model*.

**Your exercise is to complete the `generate_rhythms()` function below.** The function should return a list of durations of the same length as the tune that's also provided as an argument.

The code will be similar to the generator functions for previous models, but it requires a couple of decisions: how to start new sequences, and how to control the length of the duration sequence, since it needs to match the length of the tune.

In [None]:
rhythm_model = { # Feel free to modify this
    0.5: {0.5: 0.8, 1: 0.2},
    1: {1: 0.5, 0.5: 0.1, 2: 0.4},
    2: {1: 0.5, 2: 0.3, 4: 0.2},
    3: {0.5: 0.4, 1: 0.6},
    4: {1: 1}
}

def generate_rhythms(tune, r_model):
    rhythms = []
    # your code here
    return rhythms

In [None]:
generated_rhythms = generate_rhythms(new_tune, rhythm_model)
play_midi(new_tune, rhythms=generated_rhythms)
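
If you want to check your work before listening, a small validator like the sketch below may help. `check_rhythms` is a hypothetical helper, not part of the assignment: it only confirms that there is one duration per note and that every consecutive pair of durations is a transition the model allows. It uses a copy of the default `rhythm_model`.

```python
rhythm_model = {
    0.5: {0.5: 0.8, 1: 0.2},
    1: {1: 0.5, 0.5: 0.1, 2: 0.4},
    2: {1: 0.5, 2: 0.3, 4: 0.2},
    3: {0.5: 0.4, 1: 0.6},
    4: {1: 1}
}

def check_rhythms(tune, rhythms, r_model):
    """Return True if rhythms has one duration per note and only allowed transitions."""
    if len(rhythms) != len(tune.split()):
        return False
    for cur, nxt in zip(rhythms, rhythms[1:]):
        if nxt not in r_model.get(cur, {}):
            return False
    return True

print(check_rhythms('=3 =2 =1', [1, 0.5, 0.5], rhythm_model))  # True
print(check_rhythms('=3 =2 =1', [1, 2], rhythm_model))         # False: wrong length
print(check_rhythms('=3 =2 =1', [0.5, 2, 1], rhythm_model))    # False: 0.5 -> 2 is not in the model
```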

# Extra credit: Interval-based model

We've previously seen examples of higher-order models, which take more than one pitch as the current model state and tend to produce longer outputs that feel less random.

When generating melodies, we can approximate the idea of a higher-order model without actually constructing states from multiple tokens by using a trick: we can generate melodies in terms of **intervals** rather than individual notes.

An interval by definition consists of two notes, so by naming a single interval, like `5`, we automatically represent two pitches a fifth apart.

But since we're working in diatonic space rather than pitch class space, intervals have to be calculated using "musical math"---the notes `'=1'` and `'=3'` are separated by a *third*, even though they are two steps apart.
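
Here is a sketch of that "musical math" going from two pitches to an interval name. `interval_between` is a hypothetical helper, and it assumes the sign convention used later in the interval model: positive numbers ascend, negative numbers descend, and `1` means staying on the same note.

```python
keyboard = [o + p for o in ['-', '=', '+'] for p in '1234567']

def interval_between(keyboard, first, second):
    """Name the diatonic interval from first to second (e.g. 3 = up a third)."""
    steps = keyboard.index(second) - keyboard.index(first)
    # an interval counts both endpoints, so it is one larger than the step count
    return steps + 1 if steps >= 0 else steps - 1

print(interval_between(keyboard, '=1', '=3'))  # 3  (two steps up = a third)
print(interval_between(keyboard, '=3', '=1'))  # -3 (two steps down)
print(interval_between(keyboard, '=5', '=5'))  # 1  (same note = unison)
```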

**To begin the extra credit assignment, fill out this function** that takes in the `keyboard` that we created earlier, a current pitch (e.g. `'-7'` or `'=5'`), and an interval (a simple integer). It should return the diatonic scale degree that results from applying the interval to the current pitch.

For example, `pitch_from_interval(keyboard, '=7', 4)` should return `'+3'`.

In [None]:
def pitch_from_interval(keyboard, cur_pitch, interval):
    # your code here
    pass

In [None]:
pitch_from_interval(keyboard, '=7', 4)

Now try out the following two cells, which build a simple interval model and then generate tunes and rhythms from it. This will only work if you've completed the `pitch_from_interval()` function above.

The default model only allows for very simple melodic motion: taking a step up or down, staying on the same note, or moving up by third.

**To complete the extra credit portion, expand the interval model to allow for other melodic possibilities.** Specifically, make sure that it's possible to leap up and down by fourth, fifth, and sixth. Your model should take account of the fact that in most tonal melodies, large leaps are most often (but not always) followed by a step (movement by second) in the opposite direction.

Finally, the `generate_from_intervals()` function produces a new melody using the interval model, but it's set to end the melody as soon as scale degree 1 appears for the first time. This is not ideal because, as we've seen, it's possible to have scale degree 1 appear in a melody without being the final note.

**To complete the extra credit portion, modify this condition so that melodies still always end on scale degree 1, but so that they don't automatically end the first time it appears.**

In [None]:
interval_model = { # expand this model
    -2: {-2: 0.5, 1: 0.3, 3: 0.2},
    1: {1: 0.2, 2: 0.3, -2: 0.5},
    2: {2: 0.5, -2: 0.5},
    3: {3: 0.5, -2: 0.5},
}

def generate_from_intervals(i_model, start_pitch):
    intervals = []
    final_tune = start_pitch
    current_int = random.choice([-2, 1, 2, 3])
    generating = True
    while generating:
        intervals.append(current_int)
        current_int = next_state(i_model, current_int)  # use the model passed in as an argument
        next_pitch = pitch_from_interval(keyboard, final_tune[-2:], current_int)
        final_tune += ' ' + next_pitch
        if next_pitch[1] == '1': # improve this end condition
            generating = False

    return final_tune

This cell will use the rhythm model from earlier to supplement the new interval-based melodies.

In [None]:
interval_tune = generate_from_intervals(interval_model, '=3')
generated_rhythms = generate_rhythms(interval_tune, rhythm_model)
play_midi(interval_tune, rhythms=generated_rhythms)