# MUSICXML SUPPORT IN MUSX

An overview of loading and processing musical information stored in MusicXML files.
<!-- An overview of loading and processing MusicXML files in musx and accessing/manipulating the loaded data. -->

<hr style="height:1px;color:gray">

Notebook setup:

In [None]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
from musx import version, Note, pitch, Pitch, Interval
from musx.mxml import notation
from musx.mxml import musicxml
print(f"musx.version: {version}")

## Introduction

MusicXML has become the de facto standard for encoding musical scores, and music projects now build databases of music information that can be imported and exported as MusicXML files. This notebook introduces the musx interface for loading and analyzing information encoded in MusicXML files.

To start this overview, run the next cell to see the contents of the canonical 'Hello World' MusicXML file.
The file's graphic rendition in MuseScore is displayed directly beneath it. Spend a few minutes looking at the image and trying to find its symbolic counterpart in the MusicXML text:

In [None]:
with open('support/HelloWorld.musicxml') as file:
    print(file.read())
    print(f"file size: {file.tell()}")

#### MuseScore image:
<img src="support/HelloWorld.png" width="250" />

## The musx.mxml module

The musx.mxml module contains two layers of support for working with MusicXML:

* The <b>musx.mxml.notation</b> module is a "high-level" interface that directly translates MusicXML entities into Python objects representing Western symbolic music notation. The high level also contains loading and mapping utilities to access music symbols and perform analysis.

* The <b>musx.mxml.musicxml</b> module is a "low-level" interface containing Python class definitions for every entity defined in the MusicXML schema. The interface was generated from the official [MusicXML 4.0 partwise schema](https://www.w3.org/2021/06/musicxml40/musicxml-reference/elements/score-partwise/) using the [generateDS](https://pypi.org/project/generateDS/) Python package developed by Dave Kuhlman.

## The musx.mxml.notation interface

The high-level notation interface translates a MusicXML score into a musx Notation object containing metadata and assorted musical objects such as parts, voices, bars, notes, rhythms, clefs, keys, etc. Notations are designed to let software programs easily access score data and perform analysis; they currently have no graphical representation. New notation symbols can be added to the module by associating each object's MusicXML name with a parsing function and adding the pair to the module's parsing dictionary. For information about the existing notation symbols, consult the mxml module documentation.
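
The idea of a parsing dictionary can be sketched with the standard library alone. The example below is a minimal illustration, not the musx implementation: the names `PARSERS`, `parse_clef`, `parse_key`, `parse_time`, and `parse_children` are hypothetical, and the XML fragment is a simplified `<attributes>` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical parsing functions: each takes an XML element and returns a
# plain Python tuple. These names are illustrative and not part of musx.
def parse_clef(elem):
    return ("clef", elem.findtext("sign"), int(elem.findtext("line")))

def parse_key(elem):
    return ("key", int(elem.findtext("fifths")))

# The "parsing dictionary": MusicXML element name -> parsing function.
PARSERS = {"clef": parse_clef, "key": parse_key}

def parse_time(elem):
    return ("time", int(elem.findtext("beats")), int(elem.findtext("beat-type")))

# Supporting a new symbol only requires registering its parser.
PARSERS["time"] = parse_time

ATTRIBUTES = """
<attributes>
  <divisions>1</divisions>
  <key><fifths>0</fifths></key>
  <time><beats>4</beats><beat-type>4</beat-type></time>
  <clef><sign>G</sign><line>2</line></clef>
</attributes>
"""

def parse_children(xml_text):
    """Dispatch each child element to its registered parser; skip unknown names."""
    root = ET.fromstring(xml_text)
    return [PARSERS[child.tag](child) for child in root if child.tag in PARSERS]

symbols = parse_children(ATTRIBUTES)
print(symbols)
```

Elements without a registered parser (here `<divisions>`) are silently skipped, so the set of supported symbols can grow one entry at a time.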

The `notation.load()` function parses a MusicXML file and loads its symbols into a musx Notation object:

In [None]:
hello = notation.load("support/HelloWorld.musicxml")
print(hello)

Use the Notation's `print()` method to examine the musical information imported from the file. Start by printing metadata from the loaded notation:

In [None]:
hello.print(metadata=True)

Call `print()` without arguments to display the objects contained in the Notation. Display indentation marks sub-object containers: notations contain parts; parts contain measures; measures contain elements (music symbols such as clefs, key signatures, and notes); and notes possess attributes: metric start time, metric duration, pitch/rest/chord markers, amplitude, and (possibly) MusicXML markup such as staff and voice ids:

In [None]:
hello.print()

### Accessing notation objects

Notation objects that contain other objects (e.g. Notation, Part, Measure) are traversable using Python's standard iteration constructs. As an example, this comprehension returns all the parts in the notation:

In [None]:
[part for part in hello]

Here is how to return all the measures in all the parts of the notation:

In [None]:
[measure for part in hello for measure in part]

This returns all the entries in all the measures in all the parts of the notation:

In [None]:
[symbol for part in hello for measure in part for symbol in measure]

This returns only the notes in all the measures in all the ...

In [None]:
[symbol for part in hello for measure in part for symbol in measure if isinstance(symbol, Note)]

It is also possible to access sub-container data using indexing:

In [None]:
hello.parts[0].measures[0].elements[4]
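
One way a container hierarchy can support both iteration and indexing is sketched below. This is a simplified model under stated assumptions, not the musx classes: the dataclasses only borrow the names `Notation`, `Part`, and `Measure` and the attribute names `parts`, `measures`, and `elements` used above, and strings stand in for real clef, key, meter, and note objects.

```python
from dataclasses import dataclass, field

@dataclass
class Measure:
    elements: list = field(default_factory=list)
    def __iter__(self):
        # Iterating a measure yields its elements in order.
        return iter(self.elements)

@dataclass
class Part:
    measures: list = field(default_factory=list)
    def __iter__(self):
        return iter(self.measures)

@dataclass
class Notation:
    parts: list = field(default_factory=list)
    def __iter__(self):
        return iter(self.parts)

# Strings stand in for real notation objects.
score = Notation([Part([Measure(["clef", "key", "meter", "C4"]),
                        Measure(["D4", "E4"])])])

# Nested comprehension: serial traversal of every element in the score.
flat = [symbol for part in score for measure in part for symbol in measure]
print(flat)

# Indexing: direct access to one element.
print(score.parts[0].measures[1].elements[0])
```

Defining `__iter__` on each container is what lets the nested comprehensions read naturally, while the underlying lists remain available for indexing.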

### Timepoints

The previous examples demonstrate how to access notation data in serial order. However, music consists of parts performed in *parallel*, such that the sound at any given time is the sum of all the sounding notes in all parts at that time. In order to access 'vertical' sonorities in serial (timewise) order, musx provides an analytical structure called a `Timepoint` that contains the set of all vertical notes that are sounding when any note begins. Each Timepoint contains an `onset` (beat) in a measure and a `notemap`: a dictionary whose keys are *part.voice* identifiers and whose values are the notes or chords that begin at that Timepoint's beat.

`Notation.timepoints(trace=False, spanners=False, flatten=False)`

`timepoints()` returns a list of all the timepoints in the score. This list is normally organized into sublists, one per measure; set `flatten` to True to collect a single flat list of all the timepoints in the score. Set `trace` to True if you want timepoints printed to the terminal as they are collected. The `spanners` parameter is more involved. If `spanners` is False, a timepoint collects only notes that *begin* at the current timepoint. If `spanners` is True, notes that began earlier than the current timepoint but are still sounding during it are also added to the timepoint and marked as *spanners*. A spanner is distinguishable from the other notes in the timepoint by its earlier start time and by its inclusion in the `Timepoint.spanners` list. Spanners will appear in the trace surrounded by repeat signs '::'.
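
The effect of the `spanners` option can be illustrated with a small standalone sketch. It is not the musx algorithm: the function below is a hypothetical stand-in that works on plain `(onset, duration, name)` tuples for a single measure and returns `(onset, notemap, spanner ids)` tuples instead of Timepoint objects.

```python
from fractions import Fraction

# One measure of two voices: two half notes over one whole note.
# Each note is an (onset, duration, pitch name) tuple.
voices = {
    "P1.1": [(Fraction(0), Fraction(1, 2), "C4"), (Fraction(1, 2), Fraction(1, 2), "B3")],
    "P2.1": [(Fraction(0), Fraction(1), "C3")],
}

def timepoints(voices, spanners=False):
    """Return a time-ordered list of (onset, notemap, spanner_ids) tuples."""
    # A timepoint exists wherever at least one note begins.
    onsets = sorted({onset for notes in voices.values() for onset, _, _ in notes})
    result = []
    for now in onsets:
        notemap, spanning = {}, []
        for ident, notes in voices.items():
            for onset, dur, name in notes:
                if onset == now:
                    notemap[ident] = name          # note begins here
                elif spanners and onset < now < onset + dur:
                    notemap[ident] = name          # note began earlier, still sounding
                    spanning.append(ident)
        result.append((now, notemap, spanning))
    return result

without = timepoints(voices)
with_spanners = timepoints(voices, spanners=True)
print(without[1])
print(with_spanners[1])
```

Without spanners the second timepoint holds only the upper voice; with spanners it also holds the whole note that is still sounding in the lower voice, flagged by its part.voice identifier.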

Here is an example of a second species counterpoint. We will first convert the score into timepoints without spanners and then determine the "horizontal intervals" between melodic notes in each part as well as the "vertical intervals" between the notes of the two parts:

<img src="support/2-000-A_sz18.png" width="600" />

<!-- 
<img src="support/Aus_meines_Hertzens_Grunde.png" width="500" />

bach = notation.load("support/Aus_meines_Herzens_Grunde.musicxml")
bach.timepoints()
chopin = notation.load("support/chopin_prelude_op28_no20.xml")
chopin.timepoints(spanners=True)
-->

In [None]:
species = notation.load("support/species2.musicxml")
print(species)

Parse the score into timepoints organized into measures (sublists) <!-- All the measures but the last contain two timepoints, the first timepoint starts at beat 0 and the second at 1/2 (i.e. half note): -->:

In [None]:
timeline = species.timepoints()
timeline

The timeline output from the previous cell contains 10 measures (sublists); each measure but the last has two timepoints. The first measure contains these two timepoints:

    [<Timepoint: 0     P1.1: (C4, 1/2), P2.1: (C3, 1)>,
     <Timepoint: 1/2   P1.1: (B3, 1/2)>]
    
The first timepoint starts at time 0 and contains two notes: the C4 in the upper voice (part 1, voice 1) and the C3 in the lower voice (part 2, voice 1). The second timepoint starts at time 1/2 and contains only the upper note (part 1, voice 1). 

#### Analysis 1: determine the melodic intervals in each part

In order to determine the melodic intervals in each part we need to know which notes belong to which part and voice. This can be determined by looking at the *part.voice* identifiers in the timepoint's *notemap* dictionary: notes for the top melody are identified as P1.1 (part 1 voice 1), and notes for the lower melody are P2.1 (part 2 voice 1).

This can be seen very clearly by looking at the actual attribute data of the timepoints in the first measure:

In [None]:
for timepoint in timeline[0]: print(f"onset: {timepoint.onset} notemap: {timepoint.notemap}")

Since the identifiers "P1.1" and "P2.1" distinguish the melodic lines they can be used to collect the notes for each part:

In [None]:
topmelody, bottommelody = [], []
for measure in timeline:
    for timepoint in measure:
        for identifier in timepoint.notemap:
            # named notepitch to avoid shadowing the pitch() function imported from musx
            notepitch = timepoint.notemap[identifier].pitch
            if identifier == "P1.1":
                topmelody.append(notepitch)
            else:
                bottommelody.append(notepitch)

print(f"Top melody:\n{[n.string() for n in topmelody]}", end="\n\n")
print(f"Bottom melody:\n{[n.string() for n in bottommelody]}")

To determine the melodic intervals between notes in a melody, step pairwise through the melody and collect the interval between each pair of notes. Descending melodic intervals will appear with a minus sign. After executing the next cell compare its results with the output from the previous cell:

In [None]:
topintervals = [Interval(left, right) for left, right in zip(topmelody, topmelody[1:])]
bottomintervals = [Interval(left, right) for left, right in zip(bottommelody, bottommelody[1:])]

print(f"Top melody intervals:\n{[i.string() for i in topintervals]}", end="\n\n")
print(f"Bottom melody intervals:\n{[i.string() for i in bottomintervals]}")
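
The pairwise stepping pattern and the minus-sign convention can also be shown without musx. The sketch below computes only the generic (diatonic) size of each interval from natural pitch names, so it ignores interval quality and accidentals. The helper names `diatonic_index` and `generic_interval` and the short melody are hypothetical.

```python
LETTERS = "CDEFGAB"

def diatonic_index(name):
    """Map a natural pitch name such as 'C4' to a count of diatonic steps."""
    step, octave = name[0], int(name[1:])
    return octave * 7 + LETTERS.index(step)

def generic_interval(left, right):
    """Return the signed interval number: 2 is an ascending second, -3 a descending third."""
    steps = diatonic_index(right) - diatonic_index(left)
    return steps + 1 if steps >= 0 else steps - 1

melody = ["C4", "B3", "C4", "D4", "E4", "C4"]  # hypothetical melody

# zip(melody, melody[1:]) pairs each note with its successor.
sizes = [generic_interval(left, right) for left, right in zip(melody, melody[1:])]
print(sizes)
```

A unison is reported as 1, matching the usual convention that interval numbers count both endpoints.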

#### Analysis 2: Determine the vertical intervals between parts

Determining the vertical intervals between parts is more challenging because the parts do not have the same number of notes: the first timepoint in each measure has a top and bottom note but the second timepoint has only the top note:

                    Top        Bottom                   
    <Timepoint: 0   (C4, 1/2), (C3, 1)> 
                    Top        (no bottom note!)
    <Timepoint: 1/2 (B3, 1/2)>
    

This is a case where spanning is useful: by including spanners, each timepoint will include all the actively sounding notes regardless of part or voice:

In [None]:
timeline = species.timepoints(spanners=True)
timeline

By parsing the timeline with `spanners=True`, the second timepoint in each measure now includes two notes: the note in the top voice that starts on the second beat and the whole note in the bottom part that is still sounding from the downbeat. (Spanning notes are marked by !! delimiters in the output.) With both voices present in every timepoint, the vertical intervals can be collected directly:

In [None]:
verticalintervals = []
for measure in timeline:
    for timepoint in measure:
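        # look the notes up by part.voice id rather than relying on dictionary order
        top, bottom = timepoint.notemap["P1.1"], timepoint.notemap["P2.1"]
        verticalintervals.append(Interval(bottom.pitch, top.pitch))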
        
print(f'Vertical intervals:\n{[i.string() for i in verticalintervals]}')
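
For comparison, a similar result can be obtained from a timeline collected without spanners by carrying the most recent note of each voice forward. The sketch below is an alternative technique, not what `timepoints(spanners=True)` does internally. It assumes every voice sounds continuously with no rests, and it uses hypothetical data in which MIDI key numbers stand in for note objects.

```python
# (onset, notemap) pairs as they would look without spanners.
# MIDI key numbers stand in for notes: 60 is C4, 48 is C3.
timeline = [
    (0.0, {"P1.1": 60, "P2.1": 48}),
    (0.5, {"P1.1": 59}),
    (1.0, {"P1.1": 60, "P2.1": 50}),
    (1.5, {"P1.1": 62}),
]

sounding = {}   # the latest note heard in each voice
vertical = []   # top minus bottom, in semitones
for onset, notemap in timeline:
    # New onsets replace the previous note in their voice; other voices persist.
    sounding.update(notemap)
    vertical.append(sounding["P1.1"] - sounding["P2.1"])

print(vertical)
```

The carry-forward approach breaks down as soon as a voice rests, which is one reason explicit spanner tracking is the safer choice for real scores.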

## The low-level musicxml interface

To work with the low-level interface you must be familiar with MusicXML entities as well as the Python classes in the musx.mxml.musicxml module.

`musicxml.parse(inFileName, silence=False, print_warnings=True)`

The `musicxml.parse()` function transforms a MusicXML text file into a graph of Python XML objects. The return value is the root node of the graph.  Note: the first time `parse()` is called it may take Python some seconds to load the very large MusicXML class definition file.

In [None]:
root = musicxml.parse("support/HelloWorld.musicxml", silence=True)
print(f"root node: {root}")

Define a helper function that returns only the attributes defined in the MusicXML schema and omits any attributes inherited from the generateDS implementation itself:

In [None]:
def schema_attrs(node):
    return [d for d in dir(node) if not (callable(getattr(node, d)) 
                                         or d.startswith(('_', 'gds', 'subclass', 'superclass', 'tzoff'))
                                         or d.endswith('_'))]

print(f"schema_attrs: {schema_attrs}")
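
To see what the filter keeps and drops without loading a score, the helper can be tried on a small stand-in object. `FakePitch` below is hypothetical: its bookkeeping attribute names are chosen to resemble the kind that generated classes carry, and they are assumptions rather than a copy of the real musicxml classes.

```python
def schema_attrs(node):
    return [d for d in dir(node) if not (callable(getattr(node, d))
                                         or d.startswith(('_', 'gds', 'subclass', 'superclass', 'tzoff'))
                                         or d.endswith('_'))]

class FakePitch:
    """Hypothetical stand-in that mimics the shape of a generated class."""
    subclass = None       # dropped: starts with 'subclass'
    superclass = None     # dropped: starts with 'superclass'

    def __init__(self):
        self.step = "C"                  # kept: schema data
        self.octave = 4                  # kept: schema data
        self.alter = None                # kept: schema data, even when unset
        self.gds_collector_ = None       # dropped: starts with 'gds'
        self.original_tagname_ = None    # dropped: ends with '_'

    def get_step(self):                  # dropped: callable
        return self.step

attrs = schema_attrs(FakePitch())
print(attrs)
```

Because `dir()` returns names in sorted order, the surviving attributes come back alphabetically rather than in schema order.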

Display the MusicXML attributes of the root node:

In [None]:
schema_attrs(root)

The root.part attribute will contain all the part objects in the score (this score has only one part definition):

In [None]:
root.part

Show the MusicXML attributes of a part:

In [None]:
schema_attrs(root.part[0])

Every MusicXML part will have a unique id:

In [None]:
root.part[0].id

A part can contain one or more measures; our score has only one measure:

In [None]:
root.part[0].measure

Display the attributes of a measure:

In [None]:
schema_attrs(root.part[0].measure[0])

The note attribute of a measure holds the temporal events contained in the measure:

In [None]:
root.part[0].measure[0].note

Display the attributes of a note:

In [None]:
print(schema_attrs(root.part[0].measure[0].note[0]))

This example displays the staff, pitch, rest, and duration values for each of the notes in measure 0. Notice that the first note (the note at index 0) has a pitch and no rest, while the second note is the reverse: it has a rest and no pitch:

In [None]:
print("Note 0 (top staff):")
print(f"  staff: {root.part[0].measure[0].note[0].staff}")
print(f"  pitch: {root.part[0].measure[0].note[0].pitch}")
print(f"  rest: {root.part[0].measure[0].note[0].rest}")
print(f"  duration: {root.part[0].measure[0].note[0].duration}")
print("Note 1 (bottom staff):")
print(f"  staff: {root.part[0].measure[0].note[1].staff}")
print(f"  pitch: {root.part[0].measure[0].note[1].pitch}")
print(f"  rest: {root.part[0].measure[0].note[1].rest}")
print(f"  duration: {root.part[0].measure[0].note[1].duration}")
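
The same part, measure, note walk can be cross-checked with the standard library's `xml.etree.ElementTree`, which exposes the raw element tree that the generated classes wrap. The fragment below is a simplified, hypothetical score written inline for illustration; it is not the notebook's HelloWorld file.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical partwise fragment: one pitched note and one rest.
SCORE = """<score-partwise version="4.0">
  <part-list>
    <score-part id="P1"><part-name>Music</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>4</duration>
        <staff>1</staff>
      </note>
      <note>
        <rest/>
        <duration>4</duration>
        <staff>2</staff>
      </note>
    </measure>
  </part>
</score-partwise>"""

root = ET.fromstring(SCORE)
print(root.tag, root.get("version"))

notes = []
for part in root.findall("part"):
    for measure in part.findall("measure"):
        for note in measure.findall("note"):
            pitch = note.find("pitch")
            # A note element carries either a <pitch> or a <rest> child.
            name = "rest" if pitch is None else pitch.findtext("step") + pitch.findtext("octave")
            notes.append((part.get("id"), measure.get("number"), name,
                          int(note.findtext("duration"))))
print(notes)
```

Unlike the generated classes, ElementTree returns every value as text, so numeric fields such as the duration must be converted explicitly.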

The first note (the note at index 0) contains a pitch. Here are its attributes:

In [None]:
schema_attrs(root.part[0].measure[0].note[0].pitch)

The step of a pitch is a string letter and the octave of a pitch is an int:

In [None]:
step = root.part[0].measure[0].note[0].pitch.step
octave = root.part[0].measure[0].note[0].pitch.octave
info = [step, octave]
print(info)

Given the step and octave values of a MusicXML pitch, it is easy to convert it to a musx Pitch and access the information in different ways. Representations for any other MusicXML attributes can be developed in similar ways.

In [None]:
p = Pitch("".join([str(i) for i in info]))
print(f"Pitch:  {repr(p)}")
print(f"keynum: {p.keynum()}")
print(f"hertz:  {p.hertz()}")
print(f"pc:     {p.pc()}")
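
For reference, the arithmetic behind key numbers, pitch classes, and frequencies can be sketched without musx. The functions below are hypothetical helpers, not the musx `Pitch` methods. They assume twelve-tone equal temperament with A4 = 440 Hz and the MIDI convention that C4 is key number 60, and they take an optional `alter` value in semitones as in a MusicXML pitch.

```python
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def keynum(step, octave, alter=0):
    """MIDI key number for a step/octave/alter triple (C4 = 60)."""
    return 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter

def hertz(key):
    """Equal-tempered frequency of a MIDI key number, with A4 (key 69) at 440 Hz."""
    return 440.0 * 2 ** ((key - 69) / 12)

c4 = keynum("C", 4)
a4 = keynum("A", 4)
fsharp4 = keynum("F", 4, alter=1)

print(f"keynum: {c4}")
print(f"pc:     {c4 % 12}")
print(f"hertz:  {hertz(c4):.2f}")
```

The pitch class is simply the key number modulo 12, so every C maps to 0 regardless of octave.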