# Working with Harmonic Annotations

In [1]:
import dimcat as dc
import pandas as pd
import pitchtypes as pt # this requires the development branch of pitchtypes

## Load corpus

Use dimcat's `Corpus` class to load a dataset.
Each corpus consists of one or more subcorpora (here only `ABC`),
which in turn consist of several pieces (here `n01op18-1_01`, `n01op18-1_02`, etc.).

A `Corpus` has several representations of each piece (e.g. a list of chord labels or a list of notes) called *facets*.
Each facet is represented by a dataframe.

Corpora can be processed, e.g. by slicing notes according to different criteria (see below).
The output of such an operation is again a corpus with facets.

In [2]:
# this takes some time because it parses the original data, not the preprocessed tsv files
corpus = dc.Corpus()
corpus.load("./ABC")
corpus.data

282 files.
KEY -> EXTENSIONS
-----------------
ABC -> {'.mscx': 71, '.tsv': 211}

None of the 71 score files have been parsed.

All 211 tabular files have been parsed, 70 of them as Annotations object(s).
KEY -> ANNOTATION LAYERS
------------------------
ABC -> staff  voice  harmony_layer  color  
    -> 4      1      1 (dcml)       default    27362
    -> 1      1      0 (dcml)       default      731

## Get notes

In [3]:
notes = corpus.get_facet("notes")
notes

corpus,fname,interval,quarterbeats,duration_qb,mc,mn,mc_onset,mn_onset,timesig,staff,voice,duration,gracenote,nominal_duration,scalar,tied,tpc,midi,volta,chord_id,tremolo
ABC,n01op18-1_01,"[0.0, 1.0)",0,1.0,1,1,0,0,3/4,3,1,1/4,,1/4,1,1,-1,53,,12,
ABC,n01op18-1_01,"[0.0, 1.0)",0,1.0,1,1,0,0,3/4,4,1,1/4,,1/4,1,1,-1,53,,18,
ABC,n01op18-1_01,"[0.0, 1.0)",0,1.0,1,1,0,0,3/4,1,1,1/4,,1/4,1,1,-1,65,,0,
ABC,n01op18-1_01,"[0.0, 1.0)",0,1.0,1,1,0,0,3/4,2,1,1/4,,1/4,1,1,-1,65,,6,
ABC,n01op18-1_01,"[1.0, 1.5)",1,0.5,1,1,1/4,1/4,3/4,3,1,1/8,,1/8,1,-1,-1,53,,13,
ABC,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
ABC,n16op135_04,"[1146.0, 1147.0)",1146,1.0,283,282,0,0,4/4,3,1,1/4,,1/4,1,,-1,53,,2731,
ABC,n16op135_04,"[1146.0, 1147.0)",1146,1.0,283,282,0,0,4/4,1,1,1/4,,1/4,1,,3,69,,2729,
ABC,n16op135_04,"[1146.0, 1147.0)",1146,1.0,283,282,0,0,4/4,2,1,1/4,,1/4,1,,3,69,,2730,
ABC,n16op135_04,"[1146.0, 1147.0)",1146,1.0,283,282,0,0,4/4,2,1,1/4,,1/4,1,,-1,77,,2730,


Translate the pitch columns `tpc` (tonal pitch class on the line of fifths, with C = 0) and `midi` (MIDI pitch number) into spelled pitches:

In [4]:
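def get_pitches(tpc, midi):
    """
    Takes the tpc and midi columns of the notes df.
    Returns a SpelledPitchArray.
    """
    # tpc encodes the spelled pitch class as a position on the line of fifths
    pcs = pt.SpelledPitchClassArray(tpc)
    # number of sharps (positive) or flats (negative) per note
    alterations = pcs.alteration()
    # remove the accidentals so that e.g. B#3 and Cb4 land in the correct octave
    midi_base = midi - alterations
    # MIDI 60 is C4, so octave = midi // 12 - 1
    octaves = (midi_base // 12) - 1
    return pt.SpelledPitchArray.from_independent(tpc, octaves)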

pitches = get_pitches(notes['tpc'], notes['midi'])
pitches

asp(['F3', 'F3', 'F4', ..., 'A4', 'F5', 'F6'])
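To make the arithmetic inside `get_pitches` explicit, here is a minimal sketch of the same conversion for a single note in plain Python, without `pitchtypes`. The helper name `spelled_name` is hypothetical and not part of any library; it assumes that `tpc` counts fifths from C (F = -1, C = 0, G = 1, ...) and that MIDI 60 is C4.

```python
def spelled_name(tpc, midi):
    """Return a spelled pitch name such as 'F3' or 'Bb4' (illustrative helper)."""
    # Natural letters occupy positions -1..5 on the line of fifths.
    letter = "FCGDAEB"[(tpc + 1) % 7]
    # Every 7 steps along the line of fifths adds one sharp (or one flat downwards).
    alteration = (tpc + 1) // 7
    accidental = "#" * alteration if alteration >= 0 else "b" * -alteration
    # Remove the accidental before computing the octave,
    # so that B#3 (MIDI 60) and Cb4 (MIDI 59) are assigned correctly.
    octave = (midi - alteration) // 12 - 1
    return f"{letter}{accidental}{octave}"


# values taken from the first rows of the notes table
print(spelled_name(-1, 53))
print(spelled_name(-1, 65))
print(spelled_name(1, 55))
```

For the rows shown above this yields `F3`, `F4` and `G3`, in line with the `pitch_str` values produced by `pitchtypes`.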

Assign the pitch names back to the dataframe:

In [5]:
notes['pitch_str'] = pitches.name() # a vector of names
notes.head(20)

corpus,fname,interval,quarterbeats,duration_qb,mc,mn,mc_onset,mn_onset,timesig,staff,voice,duration,gracenote,nominal_duration,scalar,tied,tpc,midi,volta,chord_id,tremolo,pitch_str
ABC,n01op18-1_01,"[0.0, 1.0)",0,1.0,1,1,0,0,3/4,3,1,1/4,,1/4,1,1.0,-1,53,,12,,F3
ABC,n01op18-1_01,"[0.0, 1.0)",0,1.0,1,1,0,0,3/4,4,1,1/4,,1/4,1,1.0,-1,53,,18,,F3
ABC,n01op18-1_01,"[0.0, 1.0)",0,1.0,1,1,0,0,3/4,1,1,1/4,,1/4,1,1.0,-1,65,,0,,F4
ABC,n01op18-1_01,"[0.0, 1.0)",0,1.0,1,1,0,0,3/4,2,1,1/4,,1/4,1,1.0,-1,65,,6,,F4
ABC,n01op18-1_01,"[1.0, 1.5)",1,0.5,1,1,1/4,1/4,3/4,3,1,1/8,,1/8,1,-1.0,-1,53,,13,,F3
ABC,n01op18-1_01,"[1.0, 1.5)",1,0.5,1,1,1/4,1/4,3/4,4,1,1/8,,1/8,1,-1.0,-1,53,,19,,F3
ABC,n01op18-1_01,"[1.0, 1.5)",1,0.5,1,1,1/4,1/4,3/4,1,1,1/8,,1/8,1,-1.0,-1,65,,1,,F4
ABC,n01op18-1_01,"[1.0, 1.5)",1,0.5,1,1,1/4,1/4,3/4,2,1,1/8,,1/8,1,-1.0,-1,65,,7,,F4
ABC,n01op18-1_01,"[1.5, 1.75)",3/2,0.25,1,1,3/8,3/8,3/4,3,1,1/16,,1/16,1,,1,55,,14,,G3
ABC,n01op18-1_01,"[1.5, 1.75)",3/2,0.25,1,1,3/8,3/8,3/4,4,1,1/16,,1/16,1,,1,55,,20,,G3


Fun things to do with pitch arrays, such as vectorized interval arithmetic:

In [6]:
# express relative to C4
pitches - pt.SpelledPitch("C4")

asi(['-P5:0', '-P5:0', 'P4:0', ..., 'M6:0', 'P4:1', 'P4:2'])
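The interval names in this output combine a direction sign, a quality, a diatonic number, and a count of additional octaves after the colon. The following is a rough sketch of how such a name can be derived from two `(tpc, midi)` pairs in plain Python. The function `spelled_interval` is a hypothetical helper, not the `pitchtypes` implementation, and the sign convention it uses for altered unisons is an assumption.

```python
def spelled_interval(tpc_from, midi_from, tpc_to, midi_to):
    """Name the directed interval between two notes, e.g. '-P5:0' (illustrative helper)."""

    def diatonic_position(tpc, midi):
        # Count letter steps from C0: letter index (C=0 ... B=6) plus 7 per octave.
        alteration = (tpc + 1) // 7
        octave = (midi - alteration) // 12 - 1
        return (tpc * 4) % 7 + 7 * octave

    fifths = tpc_to - tpc_from
    steps = diatonic_position(tpc_to, midi_to) - diatonic_position(tpc_from, midi_from)

    # Describe downward intervals as the negation of their upward counterpart.
    # Assumption: an altered unison takes its direction from the sign of the fifths.
    sign = ""
    if steps < 0 or (steps == 0 and fifths < 0):
        fifths, steps, sign = -fifths, -steps, "-"

    number = steps % 7 + 1            # 1 = unison, 2 = second, ...
    alteration = (fifths + 1) // 7    # 0 for the perfect/major form of the interval
    if number in (1, 4, 5):           # unison, fourth and fifth are perfect-type
        if alteration == 0:
            quality = "P"
        else:
            quality = "a" * alteration if alteration > 0 else "d" * -alteration
    elif alteration >= 0:
        quality = "M" if alteration == 0 else "a" * alteration
    else:
        quality = "m" if alteration == -1 else "d" * (-alteration - 1)
    return f"{sign}{quality}{number}:{steps // 7}"


# F3, F4 and A4 relative to C4 (tpc 0, midi 60)
for tpc, midi in [(-1, 53), (-1, 65), (3, 69)]:
    print(spelled_interval(0, 60, tpc, midi))
```

For F3, F4 and A4 this prints `-P5:0`, `P4:0` and `M6:0`, which agrees with the corresponding entries of the array above.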