## Using Mido

### Mido's Purpose:

Mido is a Python library that parses MIDI files into Python objects. Raw MIDI data is a binary format that is hard to read directly, so a library that makes interpretation quicker and easier is useful. The Python objects nevertheless stay true to the general structure of a MIDI file.

### Midis:

Midis are files that encode music using messages and tracks. Messages describe events such as "note on", "note off", key changes, etc. The following messages are the most important for the seismic project of predicting key change:

1. note_on - Signals for a specific note to be held down
2. note_off - Signals for a specific note to be released
3. key_signature - Provides the key signature

Note that every event has a time attribute (delta time), whose value is the time that has passed since the previous message. This is not important to the project, however, because the key signature depends only on which notes are played. Next, mido is used to parse a midi:




In [1]:
import mido
from midi_parser.parser import findMidis
from mido import MidiFile
_path = "C:\\Users\\noahs\\Data Science\\Music Generation AI\\midis\\JSB Chorales\\JSB Chorales\\train\\2.mid"
_path2 = "C:\\Users\\noahs\\Data Science\\Music Generation AI\\Deep Bach Pop109\\data\\midis_packaged\\POP909\\001\\001.mid"
_paths = findMidis("C:\\Users\\noahs\\Downloads\\NOAH MIDI")
midi = MidiFile(_paths[50])


Some midis have multiple tracks to represent different voices or instruments, so it is important to iterate over all tracks to collect every message. This midi has only one track:

In [2]:
midi.tracks

[<midi track 'Cymatics - Millenium MIDI 10 - A Min\x00' 268 messages>]
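
When a file has more than one track, the messages of every track can be gathered into one list. A minimal sketch, assuming only that the file object exposes a `tracks` attribute holding iterables of messages, as `MidiFile` does. The helper name `all_messages` and the stand-in objects are hypothetical and exist only so the snippet runs without a midi on disk:

```python
from itertools import chain
from types import SimpleNamespace


def all_messages(midi_file):
    """Return the messages of every track as one flat list, track by track."""
    return list(chain.from_iterable(midi_file.tracks))


# Stand-in for a two-track file (made-up data, not read from disk).
fake_midi = SimpleNamespace(tracks=[
    [SimpleNamespace(type="note_on", note=60, time=0)],
    [SimpleNamespace(type="note_on", note=64, time=0),
     SimpleNamespace(type="note_off", note=64, time=120)],
])

print(len(all_messages(fake_midi)))  # 3
```

With a real file the call would be `all_messages(midi)`.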

In [3]:
msgs = list(midi.tracks[0])
msgs

[<meta message track_name name='Cymatics - Millenium MIDI 10 - A Min\x00' time=0>,
 <meta message time_signature numerator=4 denominator=4 clocks_per_click=36 notated_32nd_notes_per_beat=8 time=0>,
 <meta message time_signature numerator=4 denominator=4 clocks_per_click=36 notated_32nd_notes_per_beat=8 time=0>,
 <message note_on channel=0 note=33 velocity=88 time=0>,
 <message note_on channel=0 note=45 velocity=81 time=0>,
 <message note_on channel=0 note=52 velocity=81 time=0>,
 <message note_on channel=0 note=60 velocity=90 time=0>,
 <message note_on channel=0 note=72 velocity=121 time=0>,
 <message note_off channel=0 note=72 velocity=0 time=68>,
 <message note_off channel=0 note=33 velocity=0 time=27>,
 <message note_off channel=0 note=45 velocity=0 time=0>,
 <message note_off channel=0 note=52 velocity=0 time=0>,
 <message note_off channel=0 note=60 velocity=0 time=0>,
 <message note_on channel=0 note=33 velocity=81 time=1>,
 <message note_on channel=0 note=45 velocity=72 time=0>,
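
The `time` values in the listing above are delta times, counted in ticks since the previous message in the same track. Should absolute positions ever be needed, a running sum recovers them. A minimal sketch, assuming objects with a `time` attribute; the stand-in messages copy four entries from the listing above, and `absolute_times` is a hypothetical helper name:

```python
from itertools import accumulate
from types import SimpleNamespace


def absolute_times(messages):
    """Turn per-message delta times into running totals, in ticks."""
    return list(accumulate(msg.time for msg in messages))


# Stand-ins that mirror four consecutive messages from the listing above.
sample = [
    SimpleNamespace(type="note_on", note=72, time=0),
    SimpleNamespace(type="note_off", note=72, time=68),
    SimpleNamespace(type="note_off", note=33, time=27),
    SimpleNamespace(type="note_off", note=45, time=0),
]

print(absolute_times(sample))  # [0, 68, 95, 95]
```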


Since notes are encoded as numbers, the next function converts them to letter names, which are easier to interpret:

In [4]:
# Note number 0 is a C, and the letter names repeat every 12 notes

midiNumModToLetter = {
    0:"C",
    1:"C#",
    2:"D",
    3:"D#",
    4:"E",
    5:"F",
    6:"F#",
    7:"G",
    8:"G#",
    9:"A",
    10:"A#",
    11:"B"
}


allLetters = list(midiNumModToLetter.values())

def midiToLetter(num):
    return midiNumModToLetter[num%12]



midiToLetter(4)

'E'
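
`midiToLetter` discards the octave, which is all the key analysis needs. If the octave is wanted when inspecting a file, integer division by 12 supplies it. A sketch, assuming the common convention that note 60 is C4 (some tools label it C3); `midi_to_name` and `PITCH_CLASSES` are hypothetical names:

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(num):
    """Return the letter name plus octave, assuming note 60 is C4."""
    octave = num // 12 - 1          # notes 0-11 fall in octave -1 under this convention
    return f"{PITCH_CLASSES[num % 12]}{octave}"


print(midi_to_name(60))  # C4
print(midi_to_name(33))  # A1
```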

In [5]:
notes = [midiToLetter(msg.note) for msg in msgs if msg.type == "note_on"]
notes

['A',
 'A',
 'E',
 'C',
 'C',
 'A',
 'A',
 'E',
 'C',
 'C',
 'A',
 'A',
 'E',
 'C',
 'C',
 'C',
 'B',
 'F',
 'F',
 'C',
 'A',
 'G',
 'F',
 'F',
 'C',
 'A',
 'A',
 'F',
 'F',
 'C',
 'A',
 'A',
 'C',
 'C',
 'G',
 'E',
 'D',
 'C',
 'C',
 'G',
 'E',
 'E',
 'C',
 'C',
 'G',
 'E',
 'E',
 'E',
 'D',
 'C',
 'C',
 'G',
 'E',
 'D',
 'C',
 'C',
 'G',
 'E',
 'E',
 'C',
 'C',
 'G',
 'E',
 'E',
 'C',
 'A',
 'A',
 'E',
 'C',
 'C',
 'A',
 'A',
 'E',
 'C',
 'C',
 'A',
 'A',
 'E',
 'C',
 'C',
 'C',
 'B',
 'F',
 'F',
 'C',
 'A',
 'G',
 'F',
 'F',
 'C',
 'A',
 'A',
 'F',
 'F',
 'C',
 'A',
 'A',
 'G',
 'C',
 'C',
 'G',
 'E',
 'D',
 'C',
 'C',
 'G',
 'E',
 'E',
 'C',
 'C',
 'G',
 'E',
 'E',
 'E',
 'D',
 'G',
 'G',
 'D',
 'B',
 'B',
 'G',
 'G',
 'D',
 'B',
 'B',
 'G',
 'G',
 'D',
 'B',
 'B',
 'G',
 'C']
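
One caveat for other files: the MIDI convention treats a `note_on` with velocity 0 as a note release, and some files use it in place of `note_off`. The file above uses explicit `note_off` messages, so the cell above is unaffected. A hedged sketch of a stricter filter, using stand-in objects with the same `type`, `note`, and `velocity` attributes as mido messages; `sounding_notes` is a hypothetical name:

```python
from types import SimpleNamespace


def sounding_notes(messages):
    """Return note numbers of note_on messages that actually start a note."""
    return [
        msg.note
        for msg in messages
        # The type check runs first, so messages without a velocity are skipped safely.
        if msg.type == "note_on" and msg.velocity > 0
    ]


# Made-up sample data for illustration.
sample = [
    SimpleNamespace(type="note_on", note=33, velocity=88, time=0),
    SimpleNamespace(type="note_on", note=33, velocity=0, time=68),  # acts as a release
    SimpleNamespace(type="note_off", note=45, velocity=0, time=0),
    SimpleNamespace(type="note_on", note=45, velocity=81, time=1),
    SimpleNamespace(type="time_signature", time=0),                  # no velocity attribute
]

print(sounding_notes(sample))  # [33, 45]
```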

Then this next bit of code gets the proportions of each distinct note:

In [6]:
proportions = {distinctNote: notes.count(distinctNote) / len(notes) for distinctNote in set(allLetters)}
proportions

{'C': 0.30303030303030304,
 'C#': 0.0,
 'F': 0.09090909090909091,
 'G': 0.14393939393939395,
 'D': 0.06060606060606061,
 'F#': 0.0,
 'A#': 0.0,
 'E': 0.17424242424242425,
 'G#': 0.0,
 'B': 0.06060606060606061,
 'A': 0.16666666666666666,
 'D#': 0.0}

In [7]:
sum(proportions.values())

0.9999999999999999
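
The sum prints as 0.9999999999999999 rather than 1 because floating point division and addition round at each step, so a check against 1 should use a tolerance. A sketch of the same proportion calculation with `collections.Counter`, which counts all notes in a single pass, and `math.isclose` for the sanity check. The short `sample_notes` list is made-up data, and `note_proportions` and `PITCH_CLASSES` are hypothetical names:

```python
import math
from collections import Counter

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_proportions(notes):
    """Return the share of each of the 12 letter names in a list of notes."""
    counts = Counter(notes)
    total = len(notes)
    if total == 0:
        # Avoid dividing by zero for a file without any notes.
        return {name: 0.0 for name in PITCH_CLASSES}
    return {name: counts[name] / total for name in PITCH_CLASSES}


sample_notes = ["A", "A", "E", "C", "C", "G"]
sample_proportions = note_proportions(sample_notes)

print(math.isclose(sum(sample_proportions.values()), 1.0))  # True
```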

Neural networks take vectors as input, not dictionaries, so the next function converts the dictionary to a vector:

In [8]:
reverseInd = dict([(value, key) for key, value in midiNumModToLetter.items()])
def convertDictToXSample(p):
    return [
        p["C"],
        p["C#"],
        p["D"],
        p["D#"],
        p["E"],
        p["F"],
        p["F#"],
        p["G"],
        p["G#"],
        p["A"],
        p["A#"],
        p["B"]
    ]
convertDictToXSample(proportions)

[0.30303030303030304,
 0.0,
 0.06060606060606061,
 0.0,
 0.17424242424242425,
 0.09090909090909091,
 0.0,
 0.14393939393939395,
 0.0,
 0.16666666666666666,
 0.0,
 0.06060606060606061]
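
The same conversion can also be written as a loop over a fixed order of letter names, which keeps the ordering in one place. A minimal sketch, where `PITCH_CLASSES` and `dict_to_vector` are hypothetical names. Missing keys default to 0.0 here, which is an assumption that goes beyond the function above (it raises a `KeyError` instead):

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def dict_to_vector(p):
    """Return the 12 proportions ordered from C to B."""
    return [p.get(name, 0.0) for name in PITCH_CLASSES]


# Made-up proportions for illustration.
example = {"C": 0.5, "E": 0.25, "G": 0.25}
print(dict_to_vector(example))
# [0.5, 0.0, 0.0, 0.0, 0.25, 0.0, 0.0, 0.25, 0.0, 0.0, 0.0, 0.0]
```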

Of course, another important function is finding the key signature of a midi. Not all midis have key signatures, so searching for valid midis might take some time. Here is the key signature function:

In [9]:
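# Find a midi's key signature


def findKey(mf):
    # Return the key of the first key_signature message found in any track
    for track in mf.tracks:
        for msg in track:
            if msg.type == "key_signature":
                return msg.key
    # No track contains a key_signature message
    return None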


findKey(midi)
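
Because many midis lack a `key_signature` message, building a dataset means scanning a list of paths and keeping only the files where a key is found. A sketch of that filter under stated assumptions: `find_key` repeats the logic of `findKey` above, `labelled_files` is a hypothetical helper, and the `load` parameter stands for a loader such as `MidiFile`. A stand-in loader is used so the snippet runs without files on disk, and the caught exception types are a guess at how unreadable files fail:

```python
from types import SimpleNamespace


def find_key(mf):
    """Return the key of the first key_signature message, or None."""
    for track in mf.tracks:
        for msg in track:
            if msg.type == "key_signature":
                return msg.key
    return None


def labelled_files(paths, load):
    """Return (path, key) pairs for files that declare a key signature."""
    labelled = []
    for path in paths:
        try:
            key = find_key(load(path))
        except (OSError, EOFError, ValueError):
            # Assumed failure modes for unreadable files; adjust as needed.
            continue
        if key is not None:
            labelled.append((path, key))
    return labelled


# Stand-in files (made-up data): one with a key, one without.
fake_files = {
    "a.mid": SimpleNamespace(tracks=[[SimpleNamespace(type="key_signature", key="Am", time=0)]]),
    "b.mid": SimpleNamespace(tracks=[[SimpleNamespace(type="note_on", note=60, time=0)]]),
}


def fake_load(path):
    if path not in fake_files:
        raise OSError(f"cannot read {path}")
    return fake_files[path]


print(labelled_files(["a.mid", "b.mid", "missing.mid"], fake_load))  # [('a.mid', 'Am')]
```

With real data the call would look like `labelled_files(_paths, MidiFile)`.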