# Parse Music XML files instead
Since Sat. Nov. 13th, 2021



The `From Merry Go Round of Life` MIDI file is challenging to process as raw MIDI
because of the transcription quality: the time signature changes briefly in places, and there are many tempo changes.

Hopefully the symbolic representation in MusicXML format is easier to process,
since we need to get the bar separations.


## Setup



In [None]:
import pandas as pd
import music21



## music21

`From Merry Go Round of Life` MIDI file, exported using `MuseScore`



In [None]:
fnm = 'Joe Hisaishi - Merry Go Round of Life (bitmidi).mxl'

# Modified from https://www.audiolabs-erlangen.de/resources/MIR/FMP/C1/C1S2_MusicXML.html
def xml_to_list(xml):
    """Convert a music xml file to a list of note events

    Notebook: C1/C1S2_MusicXML.ipynb

    Args:
        xml (str or music21.stream.Score): Either a path to a music xml file or a music21.stream.Score

    Returns:
        score (list): A list of note events where each note is specified as
            ``[start, duration, pitch, velocity, label]``
    """

    if isinstance(xml, str):
        xml_data = music21.converter.parse(xml)
    elif isinstance(xml, music21.stream.Score):
        xml_data = xml
    else:
        raise RuntimeError('xml must be a path to a MusicXML file or a music21.stream.Score')

    score = []

    for part in xml_data.parts:
        instrument = part.getInstrument().instrumentName

        for note in part.flat.notes:

            if note.isChord:
                start = note.offset
                duration = note.quarterLength

                for chord_note in note.pitches:
                    pitch = chord_note.ps
                    volume = note.volume.realized
                    score.append([start, duration, pitch, volume, instrument])

            else:
                start = note.offset
                duration = note.quarterLength
                pitch = note.pitch.ps
                volume = note.volume.realized
                score.append([start, duration, pitch, volume, instrument])

    score = sorted(score, key=lambda x: (x[0], x[2]))
    return score

xml_data = music21.converter.parse(fnm)  # `fnm` is the file name defined at the top of this cell
xml_list = xml_to_list(xml_data)

# The second field of each event is a duration in quarter lengths, not an end time
df = pd.DataFrame(xml_list[:9], columns=['Start', 'Duration', 'Pitch', 'Velocity', 'Instrument'])
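
Since the goal is to get the bar separations, here is a minimal sketch of how note events in the `[start, duration, pitch, velocity, label]` format could be grouped into bars when the time signature changes along the way. It uses plain Python only. The helper names `bar_starts` and `group_by_bar`, the time signature list, and the sample notes are all hypothetical and made up for illustration; they are not taken from the actual file. It also assumes offsets are in quarter lengths, the first time signature sits at offset 0, and every change falls on a bar line.

```python
from bisect import bisect_right
from fractions import Fraction


def bar_starts(time_signatures, total_length):
    """Return the start offset (in quarter lengths) of every bar.

    time_signatures: list of (offset, numerator, denominator), sorted by
        offset, with the first entry at offset 0 (assumption).
    total_length: length of the piece in quarter lengths.
    """
    starts = []
    for i, (offset, num, den) in enumerate(time_signatures):
        bar_len = Fraction(4 * num, den)  # e.g. 3/4 -> 3 quarter lengths
        # This signature applies until the next change, or the end of the piece
        if i + 1 < len(time_signatures):
            end = time_signatures[i + 1][0]
        else:
            end = total_length
        pos = Fraction(offset)
        while pos < end:
            starts.append(pos)
            pos += bar_len
    return starts


def group_by_bar(notes, starts):
    """Assign each note event to the bar that contains its start offset."""
    bars = [[] for _ in starts]
    for note in notes:
        idx = bisect_right(starts, note[0]) - 1  # last bar starting at or before the note
        bars[idx].append(note)
    return bars


# Hypothetical data: two bars of 3/4, then 4/4 from offset 6 onwards
time_signatures = [(0, 3, 4), (6, 4, 4)]
notes = [
    [0.0, 1.0, 60.0, 0.5, 'Piano'],
    [3.5, 0.5, 64.0, 0.5, 'Piano'],
    [6.0, 2.0, 67.0, 0.5, 'Piano'],
    [11.0, 1.0, 72.0, 0.5, 'Piano'],
]

starts = bar_starts(time_signatures, total_length=14)
bars = group_by_bar(notes, starts)
for number, (start, events) in enumerate(zip(starts, bars), start=1):
    print(number, float(start), events)
```

With a real score, the time signature offsets would have to come from the parsed file rather than being typed in by hand, and a pickup bar would need separate handling.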


