# Coconet Synthetic Chorales with Variations
<p>This notebook is derived from Kevin Donoghue's PyTorch implementation (https://github.com/kevindonoghue/coconet-pytorch) of the original Coconet paper (https://arxiv.org/pdf/1903.07227.pdf). I was able to get his code to run with a few modifications. This notebook uses some of his code to load the model trained over 30,000 iterations.</p>
<p>Here is what I hope to do:</p>

-  Load the trained model
-  Provide a prompt that the model can start with, and produce a four-part chorale
-  Take that chorale and perform it using Victorian Rational Well Temperament, a tuning that sounds better than twelve-tone equal temperament for Bach
-  Use a Bosendorfer piano sample library instead of the chorus of vocals used in the Donoghue notebook
-  Add variations in the style of Bach's Goldberg Variations (I'm pretty vague about that part)

This notebook implements the paper https://arxiv.org/pdf/1903.07227.pdf in PyTorch. The goal is to generate samples of music, in the form of MIDI files, that sound like Bach chorales. Each Bach chorale is a piece of music for four voices. These chorales can be encoded in arrays of shape (4, N), where N is the number of 16th notes in the chorale and a value of 60 (say) at position (i, j) indicates that voice i is singing MIDI pitch 60 at the jth 16th note. These encodings are in the file Jsb16thSeparated.npz.

I split these encodings into two-measure chunks, giving arrays of shape (4, 32). After one-hot encoding the entries, they become arrays of shape (4, 32, P), where P is the number of possible pitches.
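As a rough illustration of the chunking and one-hot steps, here is a minimal NumPy sketch. The helper names `split_into_chunks` and `one_hot`, the toy data, and the value `P_toy = 57` are assumptions made for this example only; the notebook's own chunking loop appears in a later cell and differs in small details (for instance, it also skips chunks that contain rests).

```python
import numpy as np

T = 32  # two 4/4 measures of 16th notes

def split_into_chunks(chorale, chunk_len=T):
    """Cut a (4, N) array into non-overlapping (4, chunk_len) pieces, dropping any remainder."""
    n_chunks = chorale.shape[1] // chunk_len
    return [chorale[:, k * chunk_len:(k + 1) * chunk_len] for k in range(n_chunks)]

def one_hot(chunk, num_pitches):
    """Turn a (4, T) integer array with entries in [0, num_pitches) into a (4, T, num_pitches) array."""
    # row k of the identity matrix is the one-hot vector for pitch k
    return np.eye(num_pitches, dtype=np.float32)[chunk]

# hypothetical toy data: 70 time steps of random pitches in a 57-pitch range
rng = np.random.default_rng(0)
P_toy = 57
toy = rng.integers(0, P_toy, size=(4, 70))

chunks = split_into_chunks(toy)
encoded = one_hot(chunks[0], P_toy)
print(len(chunks), chunks[0].shape, encoded.shape)
```

Each (voice, time step) position in `encoded` holds exactly one 1, at the index of the pitch being sung, so `encoded.argmax(axis=-1)` recovers the original integer chunk.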

To train a neural net to generate samples like the training samples, you build training inputs that consist of a random subset of the entries of a chorale plus the locations of those entries. The neural net is then trained to predict the rest of the entries. For example, the net might be given the entries of one voice in the chorale, and its job is then to predict the entries of the other voices. In practice, this works by randomly generating arrays of shape (4, 32) whose entries are 0 and 1, where 1 marks an entry that is kept and 0 marks an entry that is erased. A chorale is multiplied by this array to erase part of its data. Then the partially erased array and the masking array of 0s and 1s are fed through the neural net, which outputs a predicted array of shape (4, 32, P). This output array is compared with the full (4, 32, P) one-hot array of the input chorale via cross-entropy loss, and gradient descent is applied with respect to this loss function. This encourages the network to learn the pitches in the Bach chorale that were erased in the input.
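The masking and loss computation described above can be sketched in plain NumPy. This is only an illustration under stated assumptions: the "network output" here is random logits standing in for the real model, `P = 57` is an assumed pitch count, and the actual training code uses PyTorch rather than NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
I, T, P = 4, 32, 57   # voices, time steps, pitches (P = 57 is assumed here)

# a stand-in chorale and its one-hot encoding
chorale = rng.integers(0, P, size=(I, T))
x = np.eye(P)[chorale]                      # shape (I, T, P)

# random 0/1 mask: 1 = entry is shown to the network, 0 = entry is erased
mask = rng.integers(0, 2, size=(I, T))
masked_x = x * mask[:, :, None]             # erased entries become all-zero rows

# stand-in for the network output: random logits of shape (I, T, P)
logits = rng.normal(size=(I, T, P))

# cross entropy between softmax(logits) and the true pitches
shifted = logits - logits.max(axis=-1, keepdims=True)     # for numerical stability
log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
loss = -(x * log_probs).sum(axis=-1).mean()

print(f"erased {int((mask == 0).sum())} of {I * T} entries, loss = {loss:.3f}")
```

In the real model, `masked_x` and the mask (tiled along the pitch axis) are stacked as input channels, and the loss is minimized by gradient descent instead of being computed once on random logits.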

To generate good samples for listening, it helps to repeatedly resample. You generate a completely unmasked chorale, then slowly freeze notes (as if the composer has decided finally that this note is good) and resample with the frozen notes masked. As you resample, you freeze more and more notes, until you're masking all the notes. At this point the sample has been generated.
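The resampling schedule can be sketched as follows. The constants (`alpha_max`, `alpha_min`, `eta`, and `2*I*T` steps) are taken from the `harmonize` function later in this notebook; the helper name `resample_probability` is an assumption for this example. Note that in `harmonize` the set of frozen notes is redrawn at random on every pass rather than accumulated, so "freezing" here means that the chance of a note being left alone rises as the steps go on.

```python
import numpy as np

def resample_probability(step, num_steps, alpha_max=0.999, alpha_min=0.001, eta=0.75):
    """Probability that a free note is resampled at a given step (linear decay, then flat)."""
    return max(alpha_min, alpha_max - step * (alpha_max - alpha_min) / (eta * num_steps))

I, T = 4, 32
num_steps = 2 * I * T  # 256 resampling passes

rng = np.random.default_rng(0)
for step in (0, 64, 128, 192, 255):
    p = resample_probability(step, num_steps)
    # 1 = frozen for this pass, 0 = resampled by the model
    frozen = rng.choice(2, size=(I, T), p=[p, 1 - p])
    print(f"step {step:3d}: p = {p:.3f}, frozen notes = {int(frozen.sum())} of {I * T}")
```

The probability falls linearly from 0.999 to 0.001 over the first three quarters of the steps and then stays at 0.001, so the last passes change very few notes.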

In [None]:
# installations needed for in-colab midi playback
# !apt install fluidsynth
# !cp /usr/share/sounds/sf2/FluidR3_GM.sf2 ./font.sf2
#!cp /usr/share/sounds/sf2/'Realistic Piano_1_2.sf2' ./font.sf2
# !pip install midi2audio

In [None]:
import numpy as np
import torch
import torch.nn as nn
import torch.utils.data
import matplotlib.pyplot as plt
import pandas as pd
import mido
import time
from midi2audio import FluidSynth
from IPython.display import Audio, display
import os
import muspy

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
softmax = torch.nn.functional.softmax

base_dir = ''

In [None]:
def play_midi(path):
    """
    A script for playing the midi files in the notebook. path is the path to the midi file to be played, relative to base_dir.
    """
    if os.path.exists('test.wav'):
        os.remove('test.wav')
    FluidSynth('font.sf2').midi_to_audio(base_dir + path, 'test.wav')
    audio = Audio('test.wav')
    display(audio)
    
path = 'test.mid'
play_midi(path)

In [None]:
# load training data
data = np.load('Jsb16thSeparated.npz', encoding='bytes', allow_pickle=True)

In [None]:
# transpose chorales to different keys, so there's more variation in training data
all_tracks = []
for x in data.files:
    for y in data[x]:
        for i in range(-6, 6):
            all_tracks.append(y + i)

print(len(all_tracks))

In [None]:
# determine highest and lowest pitches

max_midi_pitch = -np.inf
min_midi_pitch = np.inf
for x in all_tracks:
    if x.max() > max_midi_pitch:
        max_midi_pitch = int(x.max())
    if x.min() < min_midi_pitch:
        min_midi_pitch = int(x.min())
        
print(max_midi_pitch, min_midi_pitch)

In [None]:
# set global variables

I = 4 # number of voices
T = 32 # length of samples (32 = two 4/4 measures)
P = max_midi_pitch - min_midi_pitch +1 # number of different pitches
batch_size=24

In [None]:
# prepare the training dataset by cutting chorales in 2 measure pieces

train_tracks = []

for track in all_tracks:
    track = track.transpose()
    cut = 0
    while cut < track.shape[1]-T:
        if (track[:, cut:cut+T] > 0).all():
            train_tracks.append(track[:, cut:cut+T] - min_midi_pitch)
        cut += T
        

train_tracks = np.array(train_tracks).astype(int)

In [None]:
print(train_tracks.shape)

In [None]:
# function for converting arrays of shape (T, 4) into midi files
# each entry of the input array is either np.nan (representing a rest)
# or an integer between 0 and 127 inclusive

def piano_roll_to_midi(piece):
    """
    piece is an array of shape (T, 4) for some T.
    The (i,j)th entry of the array is the midi pitch of the jth voice at time i. It's an integer in range(128), or np.nan for a rest.
    Outputs a mido object mid that you can convert to a midi file by calling its .save() method.
    """
    piece = np.concatenate([piece, [[np.nan, np.nan, np.nan, np.nan]]], axis=0)

    bpm = 50
    microseconds_per_beat = 60 * 1000000 / bpm

    mid = mido.MidiFile()
    
    tracks = {'soprano': mido.MidiTrack(), 'alto': mido.MidiTrack(),
              'tenor': mido.MidiTrack(), 'bass': mido.MidiTrack()}
    past_pitches = {'soprano': np.nan, 'alto': np.nan,
                    'tenor': np.nan, 'bass': np.nan}
    delta_time = {'soprano': 0, 'alto': 0, 'tenor': 0, 'bass': 0}


    # create a track containing tempo data
    metatrack = mido.MidiTrack()
    metatrack.append(mido.MetaMessage('set_tempo',
                                      tempo=int(microseconds_per_beat), time=0))
    mid.tracks.append(metatrack)

    # create the four voice tracks
    for voice in tracks:
        mid.tracks.append(tracks[voice])
        tracks[voice].append(mido.Message(
            'program_change', program=0, time=0)) # choir aahs=52, piano = 0

    # add notes to the four voice tracks
    for i in range(len(piece)):
        pitches = {'soprano': piece[i, 0], 'alto': piece[i, 1],
                   'tenor': piece[i, 2], 'bass': piece[i, 3]}
        for voice in tracks:
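            # represent rests as None; past_pitches may already be None from an earlier step,
            # and np.isnan(None) would raise a TypeError
            if past_pitches[voice] is not None and np.isnan(past_pitches[voice]):
                past_pitches[voice] = None
            if np.isnan(pitches[voice]):
                pitches[voice] = None
            if pitches[voice] != past_pitches[voice]:
                # compare against None explicitly so that pitch 0 is not mistaken for a rest
                if past_pitches[voice] is not None:
                    tracks[voice].append(mido.Message('note_off', note=int(past_pitches[voice]),
                                                      velocity=64, time=delta_time[voice]))
                    delta_time[voice] = 0
                if pitches[voice] is not None:
                    tracks[voice].append(mido.Message('note_on', note=int(pitches[voice]),
                                                      velocity=64, time=delta_time[voice]))
                    delta_time[voice] = 0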
            past_pitches[voice] = pitches[voice]
            # 480 ticks per beat and each line of the array is a 16th note
            delta_time[voice] += 120

    return mid

In [None]:
class Chorale:
    """
    A class to store and manipulate an array self.arr that stores a chorale.
    """
    def __init__(self, arr, subtract_30=False):
        # arr is an array of shape (4, 32) with values in range(0, 57)
        self.arr = arr.copy()
        if subtract_30:
            self.arr -= 30
            
        # the one_hot representation of the array
        reshaped = self.arr.reshape(-1)
        self.one_hot = np.zeros((I*T, P))
        r = np.arange(I*T)
        self.one_hot[r, reshaped] = 1
        self.one_hot = self.one_hot.reshape(I, T, P)
        

    def to_image(self):
        # visualize the four tracks as images
        soprano = self.one_hot[0].transpose()
        alto = self.one_hot[1].transpose()
        tenor = self.one_hot[2].transpose()
        bass = self.one_hot[3].transpose()
        
        fig, axs = plt.subplots(1, 4)
        axs[0].imshow(np.flip(soprano, axis=0), cmap='hot', interpolation='nearest')
        axs[0].set_title('soprano')
        axs[1].imshow(np.flip(alto, axis=0), cmap='hot', interpolation='nearest')
        axs[1].set_title('alto')
        axs[2].imshow(np.flip(tenor, axis=0), cmap='hot', interpolation='nearest')
        axs[2].set_title('tenor')
        axs[3].imshow(np.flip(bass, axis=0), cmap='hot', interpolation='nearest')
        axs[3].set_title('bass')
        fig.set_figheight(5)
        fig.set_figwidth(15)
        return fig, axs
    
    def play(self, filename='midi_track.mid'):
        # display an in-notebook widget for playing audio
        # saves the midi file under the name filename in base_dir/midi_files
        
        midi_arr = self.arr.transpose().copy()
        midi_arr += 30
        midi = piano_roll_to_midi(midi_arr)
        midi.save(base_dir + 'midi_files/' + filename)
        play_midi('midi_files/' + filename)
        
    def elaborate_on_voices(self, voices, model):
        # voices is a collection of voice indices drawn from 0, 1, 2, 3
        # create a mask that keeps the given voices fixed
        # generate a chorale that keeps those voices and resamples the others
        mask = np.zeros((I, T))
        y = np.random.randint(P, size=(I, T))
        for i in voices:
            mask[i] = 1
            y[i] = self.arr[i].copy()
        return harmonize(y, mask, model)
    # I think we could improve this scoring method. It's never actually used for evaluations.
    def score(self):
        consonance_dict = {0: 1, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 0, 
                           7: 1, 8: 1, 9: 1, 10: 0, 11: 0}
        consonance_score = 0
        for k in range(32):
            for i in range(4):
                for j in range(i):
                    consonance_score += consonance_dict[((self.arr[i, k] - self.arr[j, k]) % 12)]
        
        note_score = 0
        for i in range(4):
            for j in range(1, 32):
                if self.arr[i, j] != self.arr[i, j-1]:
                    note_score += 1
        return consonance_score, note_score
        
# harmonize a melody
def harmonize(y, C, model):
    """
    Generate an artificial Bach Chorale starting with y, and keeping the pitches
    where C==1.
    Here C is an array of shape (4, 32) whose entries are 0 and 1.
    The pitches outside of C are repeatedly resampled to generate new values.
    For example, to harmonize the soprano line, let y be random except y[0] 
    contains the soprano line, let C[1:] be 0 and C[0] be 1.
    """
    model.eval()
    with torch.no_grad():
        x = y
        C2 = C.copy()
        num_steps = int(2*I*T)
        alpha_max = .999
        alpha_min = .001
        eta = 3/4
        for i in range(num_steps):
            p = np.maximum(alpha_min, alpha_max - i*(alpha_max-alpha_min)/(eta*num_steps))
            sampled_binaries = np.random.choice(2, size = C.shape, p=[p, 1-p])
            C2 += sampled_binaries
            C2[C==1] = 1
            x_cache = x
            x = model.pred(x, C2)
            x[C2==1] = x_cache[C2==1]
            C2 = C.copy()
        return x
    
def generate_random_chorale(model):
    """
    Calls harmonize with random initialization and C=0, masking none 
    and so generates a new sample that sounds like Bach.
    """
    y = np.random.randint(P, size=(I, T)).astype(int)
    C = np.zeros((I, T)).astype(int)
    x = harmonize(y, C, model)
    return (x)

In [None]:
hidden_size = 32

class Unit(nn.Module):
    """
    Two convolution layers each followed by batchnorm and relu, 
    plus a residual connection.
    """
    def __init__(self):
        super(Unit, self).__init__()
        self.conv1 = nn.Conv2d(hidden_size, hidden_size, 3, padding=1)
        self.batchnorm1 = nn.BatchNorm2d(hidden_size)
        self.relu1 = nn.ReLU()
        self.conv2 = nn.Conv2d(hidden_size, hidden_size, 3, padding=1)
        self.batchnorm2 = nn.BatchNorm2d(hidden_size)
        self.relu2 = nn.ReLU()
        
        
    def forward(self, x):
        y = x
        y = self.conv1(y)
        y = self.batchnorm1(y)
        y = self.relu1(y)
        y = self.conv2(y)
        y = self.batchnorm2(y)
        y = y + x
        y = self.relu2(y)
        return y
    
    

class Net(nn.Module):
    """
    A CNN that takes a starter chorale and a mask as input and outputs a prediction for the values
    in the starter chorale away from the mask that are most like the training data.
    """
    def __init__(self):
        super(Net, self).__init__()
        self.initial_conv = nn.Conv2d(2*I, hidden_size, 3, padding=1)
        self.initial_batchnorm = nn.BatchNorm2d(hidden_size)
        self.initial_relu = nn.ReLU()
        self.unit1 = Unit()
        self.unit2 = Unit()
        self.unit3 = Unit()
        self.unit4 = Unit()
        self.unit5 = Unit()
        self.unit6 = Unit()
        self.unit7 = Unit()
        self.unit8 = Unit()
        self.unit9 = Unit()
        self.unit10 = Unit()
        self.unit11 = Unit()
        self.unit12 = Unit()
        self.unit13 = Unit()
        self.unit14 = Unit()
        self.unit15 = Unit()
        self.unit16 = Unit()
        self.affine = nn.Linear(hidden_size*T*P, I*T*P)
        
    def forward(self, x, C):
        # x is a tensor of shape (N, I, T, P)
        # C is a tensor of 0s and 1s of shape (N, I, T)
        # returns a tensor of shape (N, I, T, P)
        
        # get the number of batches
        N = x.shape[0]
        
        # tile the array C out into a tensor of shape (N, I, T, P)
        tiled_C = C.view(N, I, T, 1)
        tiled_C = tiled_C.repeat(1, 1, 1, P)
        
        # mask x and combine it with the mask to produce a tensor of shape (N, 2*I, T, P)
        y = torch.cat((tiled_C*x, tiled_C), dim=1)
        
        # apply the convolution and relu layers
        y = self.initial_conv(y)
        y = self.initial_batchnorm(y)
        y = self.initial_relu(y)
        y = self.unit1(y)
        y = self.unit2(y)
        y = self.unit3(y)
        y = self.unit4(y)
        y = self.unit5(y)
        y = self.unit6(y)
        y = self.unit7(y)
        y = self.unit8(y)
        y = self.unit9(y)
        y = self.unit10(y)
        y = self.unit11(y)
        y = self.unit12(y)
        y = self.unit13(y)
        y = self.unit14(y)
        y = self.unit15(y)
        y = self.unit16(y)
            
        # reshape before applying the fully connected layer
        y = y.view(N, hidden_size*T*P)
        y = self.affine(y)
        
        # reshape to (N, I, T, P)
        y = y.view(N, I, T, P)
                
        return y
    
    def pred(self, y, C):
        # y is an array of shape (I, T) with integer entries in [0, P)
        # C is an array of shape (I, T) consisting of 0s and 1s
        # the entries of y away from the support of C should be considered 'unknown'
        
        # x is shape (I, T, P) one-hot representation of y
        compressed = y.reshape(-1)
        x = np.zeros((I*T, P))
        r = np.arange(I*T)
        x[r, compressed] = 1
        x = x.reshape(I, T, P)
        
        # prep x and C for the plugging into the model
        x = torch.tensor(x).type(torch.FloatTensor).to(device)
        x = x.view(1, I, T, P)
        C2 = torch.tensor(C).type(torch.FloatTensor).view(1, I, T).to(device)
        
        # plug x and C2 into the model
        with torch.no_grad():
            out = self.forward(x, C2).view(I, T, P).cpu().numpy()
            out = out.transpose(2, 0, 1) # shape (P, I, T)
            probs = np.exp(out) / np.exp(out).sum(axis=0) # shape (P, I, T)
            cum_probs = np.cumsum(probs, axis=0) # shape (P, I, T)
            u = np.random.rand(I, T) # shape (I, T)
            return np.argmax(cum_probs > u, axis=0)         
            
        
            

In [None]:
model = Net().to(device)

In [None]:
# load the previously trained model
# (map_location lets a checkpoint saved on a GPU load on a CPU-only machine)
model.load_state_dict(torch.load('model1.pt', map_location=device))

In [None]:
import logging
def start_logger(fname='coconet.log'):
    logger = logging.getLogger()
    fhandler = logging.FileHandler(filename=fname, mode='w')
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    fhandler.setFormatter(formatter)
    logger.addHandler(fhandler)
    logger.setLevel(logging.DEBUG)

# try out the Chorale class functionality with training samples
<code>
start_logger()
high_c = -np.inf
low_c = np.inf
s_high_c = 0
s_low_c = 0
high_n = -np.inf
low_n = np.inf
s_high_n = 0
s_low_n = 0
logging.info('training sample: samp consonance score: cons note_score: note')
logging.info('samp cons note')
for training_samples in range(len(train_tracks)):
    track = train_tracks[training_samples]
    chorale = Chorale(track)
    scores = chorale.score()
    if scores[0] > high_c: 
        high_c = scores[0]
        s_high_c = training_samples
    if scores[0] < low_c: 
        low_c = scores[0]
        s_low_c = training_samples
    if scores[1] > high_n: 
        high_n = scores[1]
        s_high_n = training_samples
    if scores[1] < low_n: 
        low_n = scores[1]
        s_low_n = training_samples
    logging.info(f'{training_samples}    {scores[0]}   {scores[1]}')
    # if training_samples > 5: break
print(f'high_c: {high_c} at sample {s_high_c}, low_c: {low_c} at sample {s_low_c}')    
print(f'high_n: {high_n} at sample {s_high_n}, low_n: {low_n} at sample {s_low_n}')    
</code>
## I really only had to do this once, and I saved the results    

In [None]:
score_sample = 3345 # lowest consonance at score of 142
track = train_tracks[score_sample]
chorale = Chorale(track)
scores = chorale.score()
print(f'{score_sample}    {scores[0]}   {scores[1]}')
chorale.to_image()
chorale.play()
plt.show()

In [None]:
score_sample = 2367 # most notes at 49
track = train_tracks[score_sample]
chorale = Chorale(track)
scores = chorale.score()
print(f'{score_sample}    {scores[0]}   {scores[1]}')
chorale.to_image()
chorale.play()
plt.show()

In [None]:
score_sample = 3 # this one was the most consonant and had the fewest notes
track = train_tracks[score_sample]
chorale = Chorale(track)
scores = chorale.score()
print(f'{score_sample}    {scores[0]}   {scores[1]}')
chorale.to_image()
chorale.play()
plt.show()

In [None]:
# let's try out a chorale generated by the model,
# elaborating on the bass track of the last example
# print('-------------')
new_chorale = Chorale(chorale.elaborate_on_voices([3], model)) # chorale was ??
print(new_chorale)
new_chorale.to_image()
new_chorale.play()
plt.show()

In [None]:
# let's try harmonizing a simple melody

melody = [66, 66, 66, 66, 71, 71, 71, 71, 73, 73, 73, 73, 75, 75, 75, 75,
         76, 76, 75, 73, 71, 71, 75, 75, 73, 73, 70, 70, 71, 71, 71, 71]

y = np.random.randint(P, size=(I, T)) # assign random numbers 0 to 56 to an array of shape(4,32)
print(f'y.shape: {y.shape}')
print(f'P: {P}, I: {I}, T: {T}')
y[0] = np.array(melody)-30
D0 = np.ones((1, T)).astype(int) # mask the soprano part off from alteration.
D1 = np.zeros((3, T)).astype(int) # move the other notes around as much as you need to.
D = np.concatenate([D0, D1], axis=0) # build the mask

for _ in range(1):
    chorale = Chorale(harmonize(y, D, model)) # pass the y (4x32), the mask, and the model
    scores = chorale.score()
    print(f'consonance & note scores: {scores}')
    chorale.to_image()
    plt.show()
    chorale.play()

In [None]:
# let's do some more overfitting investigation
# this sample has a suspiciously compelling bass line

sample = [[74, 70, 65, 58], [74, 70, 65, 58], [74, 70, 65, 57], [74, 70, 65, 57],
          [74, 70, 67, 55], [74, 70, 67, 55], [72, 69, 65, 53], [72, 69, 65, 53],
          [70, 70, 67, 55], [70, 70, 67, 55], [70, 69, 67, 51], [70, 67, 67, 51],
          [69, 69, 60, 53], [69, 69, 60, 53], [70, 65, 62, 50], [70, 65, 62, 50], 
          [72, 67, 63, 53], [72, 67, 63, 53], [72, 67, 57, 51], [72, 67, 57, 51], 
          [70, 65, 65, 46], [70, 65, 65, 46], [70, 65, 65, 46], [70, 65, 65, 46], 
          [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], 
          [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46]]
chorale = Chorale(np.array(sample).transpose(), subtract_30=True)
scores = chorale.score()
print(f'consonance & note scores: {scores}')
chorale.play()
chorale.to_image()
plt.show()

In [None]:
sample = (np.array(sample)-30).transpose()

bass_first_measure = sample[3, :16]

training_bass_first_measures = train_tracks[:, 3, :16]

sq_diff = np.power(bass_first_measure - training_bass_first_measures, 2)
distances = np.sum(sq_diff, axis=1)

distances_as_series = pd.Series(distances).sort_values()
candidates = list(distances_as_series.index[:5])
print(candidates)

for c in candidates:
    candidate_chorale = Chorale(train_tracks[c])
    scores = candidate_chorale.score()
    print(f'consonance & note scores: {scores}')
    candidate_chorale.play()
    candidate_chorale.to_image()
    plt.show()
#     track = train_tracks[c]
#     print((track + 30).transpose().tolist())
    
# verdict: the sample simply noticed something which recurs in the chorales, 
# without copying it directly

In [None]:
def pad_number(n):
    """
    prepare numbers for better file storage
    """
    # zero-pad to five digits, e.g. 42 -> '00042'
    return str(n).zfill(5)


In [None]:
# I have hopelessly screwed up this save to midi file routine. Recover, recover, recover
# create a random chorale using the harmonization method
def save_midi_chorale(chorale, id_number):
    """
    Save an existing chorale in a midi file named {id_number}midi.mid
    """  
    prediction = chorale
    print(f'chorale type: {type(chorale)}')
    prediction = prediction.transpose().tolist()
    prediction = np.array(prediction)
    midi_output = piano_roll_to_midi(prediction)
    save_name = str(pad_number(id_number)) + 'midi.mid'
    midi_output.save(save_name)    

def save_midi_random(id_number):
    """
    Generate an artificial chorale from a random seed 
    """
    prediction = generate_random_chorale(model) + 30 # 30 back on before passing to piano_roll_to_midi
    save_midi_chorale(prediction, id_number)

def save_midi_harm(base_chorale, keep, id_number):
    """
    Keep one voice and harmonize around it with the other three.
    """
    chorale_type = Chorale(base_chorale, subtract_30=True)
    chorale = chorale_type.elaborate_on_voices([keep], model)
    save_midi_chorale(chorale + 30, id_number)

In [None]:
start_logger()

In [None]:
sample = [[74, 70, 65, 58], [74, 70, 65, 58], [74, 70, 65, 57], [74, 70, 65, 57],
          [74, 70, 67, 55], [74, 70, 67, 55], [72, 69, 65, 53], [72, 69, 65, 53],
          [70, 70, 67, 55], [70, 70, 67, 55], [70, 69, 67, 51], [70, 67, 67, 51],
          [69, 69, 60, 53], [69, 69, 60, 53], [70, 65, 62, 50], [70, 65, 62, 50], 
          [72, 67, 63, 53], [72, 67, 63, 53], [72, 67, 57, 51], [72, 67, 57, 51], 
          [70, 65, 65, 46], [70, 65, 65, 46], [70, 65, 65, 46], [70, 65, 65, 46], 
          [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], 
          [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46]]
chorale = np.array(sample).transpose()
save_midi_chorale(chorale,30000)

In [None]:
metrics = []
logging.info('start logging')
logging.info(f'      file       pit pit pit  pitc poly   poly conso class')
logging.info(f'      name       ran use clas entr phony  rate nant  entr')
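for examples in range(1):
    keep = 3 # keep one voice: soprano 0, alto 1, tenor 2, or bass 3
    id_number = 30000 + examples
    save_midi_harm(chorale, keep, id_number)
    # save_midi_chorale writes to '<zero-padded id>midi.mid', so rebuild that name to read the file back
    filename = pad_number(id_number) + 'midi.mid'
    music = muspy.read(filename)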
    metric = [os.path.basename(filename),
        muspy.pitch_range(music),
        muspy.n_pitches_used(music),
        muspy.n_pitch_classes_used(music), 
        round(muspy.pitch_entropy(music),2),
        round(muspy.polyphony(music),2), 
        round(muspy.polyphony_rate(music),2),
        round(muspy.scale_consistency(music),2), 
        round(muspy.pitch_class_entropy(music),2)]
    logging.info(metric)
    metrics.append(metric)

In [None]:
metrics

In [None]:
fig, axs = plt.subplots(nrows=4, figsize=(20,10)) # nrows must equal number of tracks
muspy.show_pianoroll(music, grid_linewidth = (0.24), axs=axs )

In [None]:
# let's try out a chorale generated by the model, elaborating on the bass track of the last example
print('-------------')
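# chorale is the raw (4, 32) array of midi pitches saved above, so wrap it in a Chorale first
base_chorale = Chorale(chorale, subtract_30=True)
new_chorale = Chorale(base_chorale.elaborate_on_voices([3], model))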
new_chorale.to_image()
new_chorale.play()


In [None]:
sample = [[74, 70, 65, 58], [74, 70, 65, 58], [74, 70, 65, 57], [74, 70, 65, 57], 
          [74, 70, 67, 55], [74, 70, 67, 55], [72, 69, 65, 53], [72, 69, 65, 53], 
          [70, 70, 67, 55], [70, 70, 67, 55], [70, 69, 67, 51], [70, 67, 67, 51], 
          [69, 69, 60, 53], [69, 69, 60, 53], [70, 65, 62, 50], [70, 65, 62, 50], 
          [72, 67, 63, 53], [72, 67, 63, 53], [72, 67, 57, 51], [72, 67, 57, 51], 
          [70, 65, 65, 46], [70, 65, 65, 46], [70, 65, 65, 46], [70, 65, 65, 46], 
          [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], 
          [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46], [70, 65, 62, 46]]
chorale = Chorale(np.array(sample).transpose(), subtract_30=True)
chorale.to_image()
chorale.play()

In [None]:
# read in a midi file, check the key, load into piano roll, set up np.array containing Nx4 sample.
# calling program should slice the returned array as needed
def midi_to_input(midi_file):
    music = muspy.read(midi_file)
    if music.key_signatures != []: # check if the midi file includes a key signature - some don't
        root = music.key_signatures[0].root 
        mode = music.key_signatures[0].mode # major or minor
    else: 
        print('Warning: no key signature found. Assuming C major')
        mode = "major"
        root = 0    
    if music.time_signatures != []: # check if the midi file includes a time signature - some don't
        numerator = music.time_signatures[0].numerator
        denominator = music.time_signatures[0].denominator 
    else: 
        print('Warning: no time signature found. Assuming 4/4')
        numerator, denominator = 4, 4 # defaults, so the print below does not fail
    # turn it into a piano roll
    piano_roll = muspy.to_pianoroll_representation(music,encode_velocity=False) # boolean piano roll if False, default True
    # print(piano_roll.shape) # should be one time step for every click in the midi file
    q = music.resolution # quarter note value in this midi file. 
    q16 = q // 4 # my desired resolution is by 1/16th notes
    print(f'time signatures: {numerator}/{denominator}')
    time_steps = piano_roll.shape[0] // q16
    print(f'music.resolution is q: {q}. q16: {q16} time_steps: {time_steps} 1/16th notes')
    sample= np.zeros(shape=(time_steps,4)).astype(int) # default is float unless .astype(int)
    # This loop loads an array of shape (N, 4) with the notes that are being played in each time step.
    # Stop at time_steps*q16 so a trailing partial 16th note cannot index past the end of sample.
    for click in range(0, time_steps*q16, q16): # read the piano roll once every q16 ticks (one 16th note)
        voice = 3 # start with the low voices and decrement for the higher voices as notes get higher
        for i in range(piano_roll.shape[1]): # check if any notes are non-zero
            time_interval = (click) // q16 
            if (piano_roll[click][i]): # if velocity anything but zero - unless you set encode_velocity = False
                # if time_interval % 16 == 0:
                #     print(f'time step: {click} at index {i}, time_interval: {time_interval}, voice: {voice}')
                sample[time_interval][voice] = i - root # index to the piano roll with a note - transpose
                voice -= 1 # next instrument will get the higher note
    return (sample,root,mode)            

In [None]:
keys = ('C ','C#','D ','D#','E ','F ','F#','G ','G#','A ','A#','B ')
midi_file = 'ein_feste_burg.mid'
sample,root,mode = midi_to_input(midi_file)
print(f'file in key of {keys[root]} {mode}')
print(f'sample.shape:{sample.shape}. contains {sample.shape[0] // 4} quarter notes')
# To quote documentation, the basic slice syntax is i:j:k 
# where i is the starting index, j is the stopping index, and k is the step (when k > 0).
start_16th_note = 3*4 # skip the first three quarter notes for this one.
two_measures = sample[start_16th_note:32+start_16th_note]
chorale_input = np.array(two_measures.transpose()) # avoid shadowing the built-in input()
chorale = Chorale(chorale_input, subtract_30=True)
scores = chorale.score()
print(f'consonance and note count {scores[0]}   {scores[1]}')
chorale.to_image()
chorale.play()

In [None]:
keep = 0 # keep one voice: soprano 0, alto 1, tenor 2, or bass 3
new_chorale = Chorale(chorale.elaborate_on_voices([keep], model)) 
new_chorale.to_image()
scores = new_chorale.score()
print(f'consonance and note count {scores[0]}   {scores[1]}')
new_chorale.play()