# Getting Started Examples for the Parangonar Library

This notebook gives an overview of parangonar's main functionality:
- offline alignment
- online alignment
- visualization and evaluation
- file I/O
- aligned data

To start, we load the contents of a performance-to-score alignment file (encoded in the [match file format](https://cpjku.github.io/matchfile/)). This file contains a score, a performance, and a ground truth alignment.

In [28]:
# import libraries
import os
import matplotlib.pyplot as plt
os.environ['KMP_DUPLICATE_LIB_OK']='True'
%matplotlib inline
import parangonar as pa
import partitura as pt
from pathlib import Path
import pandas as pd

In [29]:
# load the example match file included in the library
perf_match, groundtruth_alignment, score_match = pt.load_match(
    filename=pa.EXAMPLE,  # path to the example match file bundled with parangonar
    create_score=True
)

# compute note arrays from the loaded score and performance
pna_match = perf_match[0].note_array()
sna_match = score_match[0].note_array()

In [30]:
# plot the ground truth alignment
pa.plot_alignment(pna_match, sna_match, groundtruth_alignment)

The above alignment shows a performance of Mozart KV 265, variation 1. The bottom piano roll is extracted from the score: all notes have quantized lengths, and chord onsets coincide. The top piano roll is extracted from the performance MIDI: note onsets are played expressively, and note offsets are sometimes influenced by pedalling, since all notes are held while the sustain pedal is pressed.


# Offline Note Matching

Different note matchers in parangonar compute offline alignments:
- `AutomaticNoteMatcher`:
    piano roll-based, hierarchical DTW and combinatorial optimization for pitch-wise note distribution.
    the current implementation requires a score and a performance, although the method itself does not strictly depend on this.
- `DualDTWNoteMatcher`:
    symbolic note set-based DTW, pitch-wise onset DTW, optional separate handling of ornamentations.
    requires a score and a performance for the sequence representation.
    **Default and state of the art (SOTA)** for standard score-to-performance matching.
- `TheGlueNoteMatcher`:
    pre-trained neural network for note similarity, useful for large mismatches between versions.
    works on any two MIDI files.
- `AnchorPointNoteMatcher`:
    semi-automatic version of the `AutomaticNoteMatcher`, useful if annotations can be leveraged as anchor points.
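
Several of the matchers above build on dynamic time warping (DTW). The following is a minimal, self-contained sketch of DTW between a sequence of score pitch sets (one set per onset) and a sequence of performed pitches. It is an illustration only and not parangonar's implementation: the function name `dtw_path`, the binary pitch distance, and the toy input are assumptions made for this example.

```python
def dtw_path(score_sets, perf_pitches):
    """Return a minimal-cost warping path as (score_index, perf_index) pairs.

    Toy local distance (an assumption for this sketch): 0 if the performed
    pitch is contained in the score pitch set, 1 otherwise.
    """
    n, m = len(score_sets), len(perf_pitches)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning the first i score
    # sets with the first j performed pitches
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = 0.0 if perf_pitches[j - 1] in score_sets[i - 1] else 1.0
            cost[i][j] = d + min(cost[i - 1][j - 1],  # advance both
                                 cost[i - 1][j],      # advance score only
                                 cost[i][j - 1])      # advance performance only
    # backtrack from the end of both sequences to the start
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


# toy input: a single note, a two-note chord, and another single note
score_sets = [{60}, {64, 67}, {72}]
perf_pitches = [60, 64, 67, 72]
print(dtw_path(score_sets, perf_pitches))
# [(0, 0), (1, 1), (1, 2), (2, 3)]
```

Both chord notes of the performance are mapped to the same score onset, which is the kind of coarse sequence alignment that a note-level matching step can then refine.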

### `AutomaticNoteMatcher`

In [31]:
matcher = pa.AutomaticNoteMatcher()
pred_alignment = matcher(sna_match, 
                        pna_match,
                        verbose_time=True)

# compute f-score and print the results
pa.print_fscore_alignments(pred_alignment, groundtruth_alignment)

### `DualDTWNoteMatcher`

In [32]:
# recompute note arrays from the loaded score and performance
pna_match = perf_match[0].note_array()
# because this matcher requires grace note info
sna_match = score_match[0].note_array(include_grace_notes=True)
matcher = pa.DualDTWNoteMatcher()
pred_alignment = matcher(sna_match, 
                        pna_match,
                        process_ornaments=True,
                        score_part=score_match[0]) # if a score part is passed, ornaments can be handled separately

# compute f-score and print the results
pa.print_fscore_alignments(pred_alignment, groundtruth_alignment)

### `TheGlueNoteMatcher`

In [33]:
# recompute note arrays from the loaded score and performance
pna_match = perf_match[0].note_array()
sna_match = score_match[0].note_array()
matcher = pa.TheGlueNoteMatcher()
pred_alignment = matcher(sna_match, 
                         pna_match) 

# compute f-score and print the results
pa.print_fscore_alignments(pred_alignment, groundtruth_alignment)

`TheGlueNoteMatcher` made a mistake! We can plot an alignment comparison against the ground truth to find it:

In [34]:
pa.plot_alignment_comparison(pna_match, sna_match, pred_alignment, groundtruth_alignment)

### `AnchorPointNoteMatcher` 

In [35]:
# compute note arrays from the loaded score and performance
pna_match = perf_match[0].note_array()
sna_match = score_match[0].note_array()

# compute synthetic anchor points every 4 beats
nodes = pa.match.node_array(score_match[0], 
                   perf_match[0], 
                   groundtruth_alignment,
                   node_interval=4)

# match the notes in the note arrays
matcher = pa.AnchorPointNoteMatcher()
pred_alignment = matcher(sna_match, 
                        pna_match,
                        nodes)

# compute f-score and print the results
pa.print_fscore_alignments(pred_alignment, groundtruth_alignment)

# Online / Real-time Alignment

Different note matchers in parangonar compute online alignments:
- `OnlineTransformerMatcher`:
    pre-trained neural network for local alignment decisions.
    post-processing by a tempo model.
- `OnlinePureTransformerMatcher`:
    pre-trained neural network for local alignment decisions.
    no post-processing.
    
For testing convenience, they all have an `offline` method that loops over all performed notes in a `performance_note_array` and calls the `online` method.
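
The relation between the `online` and `offline` methods can be sketched with a toy class. `ToyOnlineMatcher` below is hypothetical and not part of parangonar: it greedily matches each performed pitch to the next score note with the same pitch inside a small look-ahead window, and the note representation as `(id, pitch)` tuples is an assumption made to keep the example self-contained.

```python
class ToyOnlineMatcher:
    """Hypothetical greedy online matcher, for illustration only."""

    def __init__(self, score_notes, window=3):
        # score_notes: list of (note_id, pitch) tuples in score order
        self.score_notes = score_notes
        self.window = window
        self.position = 0  # index of the next unmatched score note

    def online(self, perf_note):
        """Process one performed note and return one alignment entry."""
        perf_id, pitch = perf_note
        candidates = self.score_notes[self.position:self.position + self.window]
        for offset, (score_id, score_pitch) in enumerate(candidates):
            if score_pitch == pitch:
                # move past the matched score note
                self.position += offset + 1
                return {"label": "match",
                        "score_id": score_id,
                        "performance_id": perf_id}
        # no candidate found: treat the performed note as an insertion
        return {"label": "insertion", "performance_id": perf_id}

    def offline(self, perf_notes):
        """Loop over all performed notes and call `online` for each one."""
        return [self.online(note) for note in perf_notes]


score_notes = [("n1", 60), ("n2", 62), ("n3", 64)]
perf_notes = [("p1", 60), ("p2", 61), ("p3", 62), ("p4", 64)]
toy_alignment = ToyOnlineMatcher(score_notes).offline(perf_notes)
print([entry["label"] for entry in toy_alignment])
# ['match', 'insertion', 'match', 'match']
```

The toy does not report skipped score notes as deletions. It only shows the control flow: all state lives in the matcher object, so a real-time system can call `online` as notes arrive, while `offline` replays a recorded performance through the same code path.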


### `OnlineTransformerMatcher` 

In [36]:
# compute note arrays from the loaded score and performance
pna_match = perf_match[0].note_array()
# this matcher requires grace note info
sna_match = score_match[0].note_array(include_grace_notes=True)

matcher = pa.OnlineTransformerMatcher(sna_match)

# the "offline" method loops over all notes in the performance and calls the "online" method for each one.
pred_alignment = matcher.offline(pna_match)

# compute f-score and print the results
pa.print_fscore_alignments(pred_alignment, groundtruth_alignment)

### `OnlinePureTransformerMatcher` 

In [37]:
# compute note arrays from the loaded score and performance
pna_match = perf_match[0].note_array()
# this matcher requires grace note info
sna_match = score_match[0].note_array(include_grace_notes=True)

matcher = pa.OnlinePureTransformerMatcher(sna_match)

# the "offline" method loops over all notes in the performance and calls the "online" method for each one.
pred_alignment = matcher.offline(pna_match)

# compute f-score and print the results
pa.print_fscore_alignments(pred_alignment, groundtruth_alignment)

# Visualize and Evaluate Alignments

We have already seen the plotter and printer in action; here they are again:

In [38]:
# this matcher makes an error on the default file, so we can use it for visualization
matcher = pa.OnlineTransformerMatcher(sna_match)
pred_alignment = matcher.offline(pna_match)

In [39]:
# show or save plot of note alignment
pa.plot_alignment(pna_match,
                sna_match,
                pred_alignment,
                save_file = False)


In [40]:
# or plot the performance and score as piano rolls given a reference: 
# we can encode errors if given ground truth
# Blue lines indicate correct matches, red lines incorrect ones.
pa.plot_alignment_comparison(pna_match, sna_match, 
                         pred_alignment, groundtruth_alignment)

In [41]:
# compute precision, recall, and f-score of a type in "insertion", "deletion", or "match"
precision, recall, fscore = pa.fscore_alignments(pred_alignment, groundtruth_alignment, types = ["match"]) 
print(fscore)
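
To make the metric concrete, here is a minimal sketch of how precision, recall, and F-score over `match` entries could be computed from two alignment lists. It assumes the alignment format used by partitura (a list of dictionaries with a `label` and the keys `score_id` and/or `performance_id`). The function `match_fscore` is a hypothetical helper written for this example; `pa.fscore_alignments` may differ in edge cases and in the scale of its return values.

```python
def match_fscore(pred_alignment, gt_alignment):
    """Precision, recall, and F-score over matched (score, performance) pairs."""

    def match_pairs(alignment):
        # collect the ID pairs of all entries labelled "match"
        return {(a["score_id"], a["performance_id"])
                for a in alignment if a["label"] == "match"}

    pred = match_pairs(pred_alignment)
    truth = match_pairs(gt_alignment)
    true_positives = len(pred & truth)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


toy_truth = [
    {"label": "match", "score_id": "n1", "performance_id": "p1"},
    {"label": "match", "score_id": "n2", "performance_id": "p2"},
    {"label": "match", "score_id": "n3", "performance_id": "p3"},
    {"label": "insertion", "performance_id": "p4"},
]
toy_pred = [
    {"label": "match", "score_id": "n1", "performance_id": "p1"},
    {"label": "match", "score_id": "n2", "performance_id": "p3"},  # wrong pair
    {"label": "deletion", "score_id": "n3"},
    {"label": "insertion", "performance_id": "p2"},
    {"label": "insertion", "performance_id": "p4"},
]
toy_precision, toy_recall, toy_fscore = match_fscore(toy_pred, toy_truth)
print(round(toy_precision, 3), round(toy_recall, 3), round(toy_fscore, 3))
# 0.5 0.333 0.4
```

One of the two predicted matches is correct (precision 0.5), and one of the three true matches is recovered (recall 1/3).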

# File I/O for Note Alignments

Most I/O functions are handled by partitura. 
For [Parangonada](https://sildater.github.io/parangonada/):
- pt.io.importparangonada.load_parangonada_alignment
- pt.io.importparangonada.load_parangonada_csv
- pt.io.exportparangonada.save_parangonada_alignment
- pt.io.exportparangonada.save_parangonada_csv

For [(n)ASAP alignments](https://github.com/CPJKU/asap-dataset):
- pt.io.importparangonada.load_alignment_from_ASAP
- pt.io.exportparangonada.save_alignment_for_ASAP

For [match files](https://cpjku.github.io/matchfile/):
- pt.io.importmatch.load_match
- pt.io.exportmatch.save_match

A basic interface for saving Parangonada-ready CSV files is also available:

In [42]:
# export a note alignment for visualization with parangonada:
# https://sildater.github.io/parangonada/
# pa.match.save_parangonada_csv(alignment, performance_data, score_data, outdir="path/to/dir")

In [43]:
# import a corrected note alignment from parangonada:
# https://sildater.github.io/parangonada/
# alignment = pt.io.importparangonada.load_parangonada_alignment(filename= 'path/to/note_alignment.csv')

In [44]:
# load note alignments of the asap dataset: 
# https://github.com/CPJKU/asap-dataset
# alignment = pt.io.importparangonada.load_alignment_from_ASAP(filename= 'path/to/note_alignment.tsv')

# Aligned Data

These note-aligned datasets are publicly available:
- [Vienna 4x22](https://github.com/CPJKU/vienna4x22)
- [(n)ASAP note alignments](https://github.com/CPJKU/asap-dataset)
- [Batik Dataset](https://github.com/huispaty/batik_plays_mozart)

Here's how you get started with note alignments on the (n)ASAP Dataset:

In [45]:
# path to a local copy of the (n)ASAP dataset; adjust to your setup
BASE_PATH = Path(r"C:\Users\silva\Documents\repos\DATA\asap-dataset")
EXAMPLE_PATH = Path("Bach/Fugue/bwv_846")
PERFORMANCE_NAME = "Shi05M"

In [46]:
# some alignments in (n)ASAP are not great, get a list of non-robust note alignments
df = pd.read_csv(Path(BASE_PATH,"metadata.csv"))
not_robust = df[df["robust_note_alignment"] == 0]
not_robust_list = not_robust["midi_performance"].tolist()
not_robust_list[:3]

In [47]:
# load tsv note alignments of the (n)ASAP dataset: 
alignment = pt.io.importparangonada.load_alignment_from_ASAP(filename= BASE_PATH / 
                                                             EXAMPLE_PATH / 
                                                             Path(PERFORMANCE_NAME + "_note_alignments/note_alignment.tsv") )
# load scores of the (n)ASAP dataset: 
score = pt.load_score(filename=  BASE_PATH / 
                      EXAMPLE_PATH / 
                      'xml_score.musicxml')
# load performance of the (n)ASAP dataset: 
performance = pt.load_performance_midi(filename=  BASE_PATH / 
                                       EXAMPLE_PATH / 
                                       Path(PERFORMANCE_NAME +".mid"))

In [48]:
# load match note alignments of the (n)ASAP dataset: 
# performance, alignment, score = pt.load_match(filename= BASE_PATH / 
#                                              EXAMPLE_PATH / 
#                                              Path(PERFORMANCE_NAME + ".match"), create_score = True )

In [49]:
# sometimes the scores contain multiple parts which are best merged for easier processing
part = pt.score.merge_parts(score)
# sometimes scores contain repeats that need to be unfolded for the alignment to make sense
unfolded_part = pt.score.unfold_part_maximal(part)

Beware! MusicXML scores need to be unfolded for (n)ASAP; match files are already unfolded!

Beware 2! Unfolding will change the note IDs: each ID gets a suffix -n for the nth repeat of that note. If the folding state is unclear, check the note IDs in the note array and the alignment list for these suffixes.
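
A suffix check of this kind can be sketched in a few lines of plain Python. The helpers `has_repeat_suffix` and `missing_score_ids` are hypothetical names for this example, and the pattern assumes that a trailing hyphen followed by digits only appears on unfolded IDs, which may not hold for every score.

```python
import re

# assumed suffix pattern: a hyphen followed by digits at the end of the ID
REPEAT_SUFFIX = re.compile(r"-\d+$")


def has_repeat_suffix(note_id):
    """True if the note ID ends in a repeat suffix such as '-1' or '-2'."""
    return bool(REPEAT_SUFFIX.search(note_id))


def missing_score_ids(alignment, score_ids):
    """Score IDs referenced by the alignment but absent from the score."""
    known = set(score_ids)
    referenced = {a["score_id"] for a in alignment if "score_id" in a}
    return sorted(referenced - known)


# toy data standing in for score_array["id"] and a loaded alignment list
folded_ids = ["n1", "n2"]
unfolded_ids = ["n1-1", "n2-1", "n1-2", "n2-2"]
toy_asap_alignment = [
    {"label": "match", "score_id": "n1-1", "performance_id": "p1"},
    {"label": "match", "score_id": "n2-1", "performance_id": "p2"},
    {"label": "match", "score_id": "n1-2", "performance_id": "p3"},
    {"label": "insertion", "performance_id": "p4"},
]

print(any(has_repeat_suffix(i) for i in folded_ids))
print(missing_score_ids(toy_asap_alignment, folded_ids))
print(missing_score_ids(toy_asap_alignment, unfolded_ids))
```

If the alignment references IDs that the score lacks, as it does for the folded toy score here, the score and the alignment are most likely in different folding states.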

In [50]:
# to get numpy arrays of the score and performance for downstream processing without partitura:
score_array = unfolded_part.note_array()
performance_array = performance.note_array()

In [None]:
pa.plot_alignment(performance_array, score_array, alignment)

In [None]:
# have a look at the score note array
score_array[:8]

In [None]:
# have a look at the performance note array
performance_array[:8]

In [None]:
# have a look at the alignment list
alignment[:8]