# Folk $n$-gram Analysis (FONN)

*FONN* (pronounced "fun") is an Irish (*Gaeilge*) word for "tune".

In this corpus we present two Polifonia components:

1. aeflkjaef
2. aeflkeajflakjf

This demo notebook shows how we currently use them.

### Prerequisites

* In `<basepath>/MIDI` we should have a corpus of folk tunes in MIDI format. By default `basepath` is `./corpus/`. If the corpus is elsewhere, change `basepath` below. We will be writing outputs to subdirectories of `basepath`.

* Install the libraries *Feather*, *PyArrow*, *fastDamerauLevenshtein*, and *Music21*:

    `pip install feather music21 pyarrow fastDamerauLevenshtein`
    
### TODO

* Some TODO notes throughout
* Maybe we should check whether the primary/secondary sequence files and n-gram files already exist before running for 40 minutes
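
The existence check suggested in the last item could be sketched roughly as follows. This is only an illustration: `missing_outputs` is a hypothetical helper, not part of the FONN code, and treating an empty directory as "not yet generated" is an assumption.

```python
import os


def missing_outputs(paths):
    """Return the paths that do not exist or are empty directories."""
    missing = []
    for path in paths:
        if not os.path.exists(path):
            missing.append(path)
        elif os.path.isdir(path) and not os.listdir(path):
            # An empty output directory means nothing has been generated yet.
            missing.append(path)
    return missing
```

A slow cell could then be guarded with something like `if missing_outputs([feat_seq_path, accents_path, duration_weighted_path]):` so that it only runs when at least one output is absent.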

In [12]:
import os.path
import sys
sys.path.append("setup_corpus") # TODO we should be able to remove this by making setup_corpus a proper module

import pandas as pd
pd.options.mode.chained_assignment = None
from fastDamerauLevenshtein import damerauLevenshtein

In [13]:
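basepath = "./corpus/"
# Build paths with os.path.join so a trailing slash in basepath cannot produce "//".
inpath = os.path.join(basepath, "MIDI")
roots_path = os.path.join(basepath, "roots.csv")
feat_seq_path = os.path.join(basepath, "feat_seq_data", "note")
accents_path = os.path.join(basepath, "feat_seq_data", "accent")
duration_weighted_path = os.path.join(basepath, "feat_seq_data", "duration_weighted")
ngram_inpath = os.path.join(basepath, "feat_seq_data", "accent")  # same directory as accents_path
ngram_outpath = os.path.join(basepath, "ngrams")
ngram_sim_inpath = os.path.join(basepath, "ngrams", "cre_pitch_class_accents_ngrams_tfidf.ftr")  # please check
# Fail early, naming the missing path, if any expected input or output location is absent.
for path in [inpath, feat_seq_path, accents_path, duration_weighted_path,
             ngram_inpath, ngram_outpath, ngram_sim_inpath]:
    assert os.path.exists(path), f"Missing path: {path}"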

The two Python modules imported in the next cell, `setup_corpus` and `corpus_processing_tools`, contain tools for reading the MIDI data and processing it into primary and secondary feature sequences, key-invariant sequences, and duration-weighted sequences.

**TODO** Define "primary and secondary feature sequences" very briefly in the notebook (they are already described in another deliverable).

In [16]:
import setup_corpus.setup_corpus as setup_corpus
from setup_corpus.corpus_processing_tools import Music21Corpus, MusicDataCorpus


Setting up lookup table for root assignment:
  note names  midi num  root num
0          C        60         0
1   C# or D-        61         1
2          D        62         2
3   D# or E-        63         3
4          E        64         4 



Setting up Music21 root detection lookup table:
  note name  pitch class
0         C            0
1        C#            1
2        D-            1
3         D            2
4        D#            3 




In [18]:
m21_corpus = Music21Corpus(inpath)

Input corpus contains 1224 melodies:

Tureengarbh Jig, The
Young And Stylish
Fun at the Fair
Buckley the Fiddler  (reel)
Tommy Coen's Reel
Lynch's Hornpipe
Tuttle's Reel
Patrick O'Connor's  (polka)
Patsy Tuohey's (reel)
Beggarman's Reel, The
Up and About In the Morning  (slide)
My Love Between Two Roses (reel)
Ladies Step Up to Tea!
Old Pigeon on the Gate, The
Curragh Races (reel), The
Mangan's Fancy (reel)
McDonagh's Reel 1
One of Tommy's  (hornpipe)
Listowel Fiddler  (slide), The
John Flynn's Jig
Callan Lasses (reel), The
Knocknaboul Reel, The
Jackson's Post-Chaise
Father Skehan's Jig
Tom Connors' Jig
Sailor's Bonnet (reel), The
Pomeroy Fiddler  (reel), The
Her Golden Hair Was Curling Down
Frog in the Puddle , The
Tap the Barrel (reel)
Trip to Durrow (reel), The
Humors of Derrycros(s)ane, The
Nine Mile House  (reel)
Duke of Leinster Hornpipe, The
Mason's Apron (reel), The
Pat O'Beirne's Favorite
Gooseberry Bush (reel), The
Carty's Reel
Brother Gildas' Jig
Hunter's Purse (reel), The
L

Running the following cell takes about 15 minutes. It produces many CSV files under `<basepath>/feat_seq_data/note`, `<basepath>/feat_seq_data/accent`, and `<basepath>/feat_seq_data/duration_weighted`.

In [None]:
corpus = setup_corpus.SetupCorpus(m21_corpus)
corpus.generate_primary_feat_seqs()
corpus.setup_music_data_corpus()
corpus.run_simple_secondary_feature_sequence_calculations()
corpus.run_key_invariant_sequence_calulations(roots_path)
corpus.run_duration_weighted_sequence_calculations(['pitch', 'pitch_class'])
corpus.save_corpus(
    feat_seq_path=feat_seq_path,
    accents_path=accents_path,
    duration_weighted_path=duration_weighted_path
)



For example, in `A Trip To Galway_note.csv`, the first few lines will be:

In [6]:
df = pd.read_csv(basepath + "/feat_seq_data/note/A Trip To Galway_note.csv")
df.head()

Unnamed: 0.1,Unnamed: 0,MIDI_note,onset,duration,velocity,interval,parsons_code,Parsons_cumsum,chromatic_root,pitch,pitch_class
0,0,74,0.0,1.0,105,0,0,0,4,10,10
1,1,71,1.0,1.0,105,-3,-1,-1,4,7,7
2,2,67,2.0,1.0,80,-4,-1,-2,4,3,3
3,3,64,3.0,1.0,80,-3,-1,-3,4,0,0
4,4,64,4.0,1.0,95,0,0,-3,4,0,0


Here, we see that **TODO** add here
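
The derived columns in the table above appear to follow directly from `MIDI_note` and `chromatic_root`. The sketch below reproduces them for the five rows shown. The formulas are inferred from the displayed values (and from the root lookup table, where C is MIDI 60), not taken from the `setup_corpus` source, and `derive_features` is a hypothetical name.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def derive_features(midi_notes, chromatic_root):
    """Recompute the derived columns from MIDI note numbers and a root (0-11).

    Assumed relationships, inferred from the example rows:
      interval       = difference from the previous note (0 for the first note)
      parsons_code   = sign of the interval
      Parsons_cumsum = running sum of parsons_code
      pitch          = MIDI_note - (60 + chromatic_root)
      pitch_class    = pitch modulo 12
    """
    rows = []
    cumsum = 0
    previous = None
    for note in midi_notes:
        interval = 0 if previous is None else note - previous
        parsons = sign(interval)
        cumsum += parsons
        pitch = note - (60 + chromatic_root)
        rows.append({
            "MIDI_note": note,
            "interval": interval,
            "parsons_code": parsons,
            "Parsons_cumsum": cumsum,
            "pitch": pitch,
            "pitch_class": pitch % 12,  # Python's % keeps the result in 0..11
        })
        previous = note
    return rows


# The first five notes of the example file, with chromatic_root = 4 (E).
example = derive_features([74, 71, 67, 64, 64], chromatic_root=4)
```

Under these assumptions, `pitch` is the distance in semitones from the tune's root, which is what makes the sequence key-invariant.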

Next, we will calculate the most important $n$-grams in each tune, calculating importance using TF-IDF.
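
As a rough illustration of this step, the sketch below extracts $n$-grams from short pitch-class sequences and scores them with a textbook TF-IDF. The weighting used here (term frequency normalised by the number of $n$-grams in the tune, multiplied by `log(N / df)`) is an assumption, and `SetupNgramsTfidf` may use a different variant. The toy sequences are made up, and `ngrams_of` and `tfidf_scores` are hypothetical helper names.

```python
import math
from collections import Counter


def ngrams_of(sequence, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def tfidf_scores(corpus, n):
    """Map each tune title to {ngram: tf-idf score}.

    corpus: dict of title -> list of feature values (e.g. pitch classes).
    """
    counts = {title: Counter(ngrams_of(seq, n)) for title, seq in corpus.items()}
    # Document frequency: number of tunes containing each n-gram at least once.
    doc_freq = Counter()
    for counter in counts.values():
        doc_freq.update(counter.keys())
    n_tunes = len(corpus)
    scores = {}
    for title, counter in counts.items():
        total = sum(counter.values())
        scores[title] = {
            gram: (count / total) * math.log(n_tunes / doc_freq[gram])
            for gram, count in counter.items()
        }
    return scores


# Made-up pitch-class sequences, for illustration only.
toy_corpus = {
    "tune_a": [0, 7, 9, 7, 0, 7, 9, 7],
    "tune_b": [0, 7, 9, 7, 5, 4, 2, 0],
}
scores = tfidf_scores(toy_corpus, n=3)
```

With this weighting, an $n$-gram that occurs in every tune scores zero, so only patterns that distinguish a tune from the rest of the corpus rank highly.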

In [7]:
from ngram_tfidf_tools import NgramCorpus
from setup_ngrams_tfidf import SetupNgramsTfidf

Again, the following cell will take about 25 minutes to run.

In [8]:
feature = "pitch_class"
n_vals = list(range(5, 10))
feat_seq_corpus = NgramCorpus(ngram_inpath)
ngram_corpus = SetupNgramsTfidf(feat_seq_corpus, feature, n_vals)
ngram_corpus.extract_ngrams()
ngram_corpus.calculate_tfidf()
ngram_corpus.save_results(outpath=ngram_outpath,
                          corpus_name='cre_pitch_class_accents')


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Primrose Girl (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          62    0.0      1.00       105         0             0   
4          72    4.0      1.00        95        10             1   
8          71    8.0      1.00       105        -1            -1   
12         69   12.0      0.66        95        -2            -1   
17         62   16.0      1.00       105        -7            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7     -5            7  
4                1               7      5            5  
8                0               7      4            4  
12              -1               7      2            2  
17              -2               7     -5            7  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Irishman's Blackthorn (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          76    0.0       2.0       105         0             0   
3          72    4.0       2.0        95        -4            -1   
5          74    8.0       1.0       105         2             1   
9          74   12.0       1.0        95         0             0   
14         76   16.0       2.0       105         2             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               9      7            7  
3               -1               9      3            3  
5                0               9      5            5  
9                0               9      5            5  
14               1               9      7            7  

Reading data from:
/Users/jmmcd/OneDrive - National University of


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Brian o Laimhin (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          67    0.0       2.0       105         0             0   
3          79    4.0       1.0        95        12             1   
7          67    8.0       1.0       105       -12            -1   
11         72   12.0       1.0        95         5             1   
15         67   16.0       2.0       105        -5            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      0            0  
3                1               7     12            0  
7                0               7      0            0  
11               1               7      5            5  
15               0               7      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, G


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Bill Harte's_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          62    0.0       1.0       105         0             0   
3          69    3.0       2.0        95         7             1   
5          71    6.0       1.0       105         2             1   
8          69    9.0       1.0        95        -2            -1   
11         62   12.0       1.0       105        -7            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      0            0  
3                1               2      7            7  
5                2               2      9            9  
8                1               2      7            7  
11               0               2      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitH

    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          79    0.0       1.0       105         0             0   
2          76    2.0       1.0       105        -3            -1   
7          71    6.0       1.0        95        -5            -1   
11         74   10.0       1.0       105         3             1   
15         69   14.0       1.0        95        -5            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               4     15            3  
2               -1               4     12            0  
7               -2               4      7            7  
11              -1               4     10           10  
15              -2               4      5            5  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Jimmy Ward's Jig_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         62    


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Preston's Reel_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          67    0.0       1.0       105         0             0   
3          67    4.0       2.0        95         0             0   
6          66    8.0       1.0       105        -1            -1   
9          66   12.0       1.0        95         0             0   
13         67   16.0       1.0       105         1             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      0            0  
3                0               7      0            0  
6               -1               7     -1           11  
9               -1               7     -1           11  
13               0               7      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/Gi

14               0               9     12            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Pinkeen , The_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         69    0.0       1.0       105         0             0   
1         74    1.0       3.0       105         5             1   
2         69    4.0       1.0        95        -5            -1   
5         74    7.0       2.0        95         5             1   
7         71   10.0       2.0        95        -3            -1   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               2      7            7  
1               1               2     12            0  
2               0               2      7            7  
5               1               2     12            0  
7               0               2      9            9  

Reading data from:
/Users/jmmcd/OneDrive


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Ranting Widow  (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          62    0.0       1.0       105         0             0   
1          64    1.0       2.0       105         2             1   
4          74    5.0       1.0        95        10             1   
8          74    9.0       1.0       105         0             0   
12         62   13.0       1.0        95       -12            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               4     -2           10  
1                1               4      0            0  
4                2               4     10           10  
8                2               4     10           10  
12               1               4     -2           10  

Reading data from:
/Users/jmmcd/OneDrive - National University of Irelan

15              -1               2      7            7  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Dan Sweeney's  1 (polka)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0       0.5       105         0             0   
6          69    4.0       1.0       105         0             0   
10         78    8.0       1.0       105         9             1   
15         71   12.0       1.0       105        -7            -1   
18         69   16.0       0.5       105        -2            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               9      0            0  
6                0               9      0            0  
10               1               9      9            9  
15               0               9      2            2  
18              -1               9      0            0  

Reading data from

12               0               7      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Full Sails to Greenland  (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          71    0.0       1.0       105         0             0   
2          79    2.0       1.0       105         8             1   
6          71    6.0       1.0        95        -8            -1   
10         72   10.0       1.0       105         1             1   
14         66   14.0       1.0        95        -6            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      4            4  
2                1               7     12            0  
6                0               7      4            4  
10               1               7      5            5  
14               0               7     -1           11  

Reading da


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Miss Lyons' Fancy (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          62    0.0       2.0       105         0             0   
3          62    4.0       1.0        95         0             0   
7          74    8.0       1.0       105        12             1   
11         76   12.0       1.0        95         2             1   
14         74   16.0       2.0       105        -2            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      0            0  
3                0               2      0            0  
7                1               2     12            0  
11               2               2     14            2  
14               1               2     12            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland,

10               0               2      4            4  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Top of the Morning (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          67    0.0       1.0       105         0             0   
2          62    2.0       1.0       105        -5            -1   
6          66    6.0       1.0        95         4             1   
10         71   10.0       1.0       105         5             1   
14         69   14.0       1.0        95        -2            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      0            0  
2               -1               7     -5            7  
6                0               7     -1           11  
10               1               7      4            4  
14               0               7      2            2  

Reading data fro

16               0               9      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Biddy Martin (polka)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          78    0.0       1.0       105         0             0   
5          76    4.0       1.0       105        -2            -1   
9          78    8.0       1.0       105         2             1   
14         76   12.0       1.0       105        -2            -1   
17         78   16.0       1.0       105         2             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2     16            4  
5               -1               2     14            2  
9                0               2     16            4  
14              -1               2     14            2  
17               0               2     16            4  

Reading data from:
/U


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Crehan's Fiddle_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          74    0.0       1.0       105         0             0   
1          71    1.0       1.0       105        -3            -1   
4          69    4.0       1.0        95        -2            -1   
7          62    7.0       1.0       105        -7            -1   
10         64   10.0       1.0        95         2             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2     12            0  
1               -1               2      9            9  
4               -2               2      7            7  
7               -3               2      0            0  
10              -2               2      2            2  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/G


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Denis Enright's Slide_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         71    0.0       1.0       105         0             0   
3         69    3.0       2.0        95        -2            -1   
5         69    6.0       3.0        95         0             0   
6         69    9.0       3.0        95         0             0   
7         71   12.0       2.0       105         2             1   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               2      9            9  
3              -1               2      7            7  
5              -1               2      7            7  
6              -1               2      7            7  
7               0               2      9            9  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/

    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0       1.0       105         0             0   
1          62    1.0       1.0       105        -7            -1   
4          69    4.0       1.0        95         7             1   
7          67    7.0       1.0       105        -2            -1   
10         69   10.0       1.0        95         2             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      7            7  
1               -1               2      0            0  
4                0               2      7            7  
7               -1               2      5            5  
10               0               2      7            7  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Spey in Spate (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0     

    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0      2.00       105         0             0   
3          69    4.0      2.00        95         0             0   
6          71    8.0      0.66       105         2             1   
11         67   12.0      1.00        95        -4            -1   
15         66   16.0      1.00       105        -1            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      7            7  
3                0               2      7            7  
6                1               2      9            9  
11               0               2      5            5  
15              -1               2      4            4  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Lady Gordon (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          66


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Thatcher's Mallet (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0       1.0       105         0             0   
1          71    1.0       1.0       105         2             1   
5          69    5.0       1.0        95        -2            -1   
9          74    9.0       1.0       105         5             1   
13         78   13.0       1.0        95         4             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      2            2  
1                1               7      4            4  
5                0               7      2            2  
9                1               7      7            7  
13               2               7     11           11  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ire

11               0               2      5            5  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Crooked Reel, The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0      0.66       105         0             0   
5          64    4.0      1.00        95        -5            -1   
8          69    8.0      1.00       105         5             1   
12         72   12.0      1.00        95         3             1   
16         71   16.0      1.00       105        -1            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               9      0            0  
5               -1               9     -5            7  
8                0               9      0            0  
12               1               9      3            3  
16               0               9      2            2  

Reading data from:
/User


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Old Oak Tree (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          66    0.0       1.0       105         0             0   
4          57    4.0       1.0        95        -9            -1   
8          66    8.0       1.0       105         9             1   
12         67   12.0       1.0        95         1             1   
16         64   16.0       2.0       105        -3            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      4            4  
4               -1               2     -5            7  
8                0               2      4            4  
12               1               2      5            5  
16               0               2      2            2  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland,


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/New Custom House (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          74    0.0      1.00       105         0             0   
2          69    2.0      1.00       105        -5            -1   
6          64    6.0      0.66        95        -5            -1   
11         72   10.0      1.00       105         8             1   
15         60   14.0      1.00        95       -12            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2     12            0  
2               -1               2      7            7  
6               -2               2      2            2  
11              -1               2     10           10  
15              -2               2     -2           10  

Reading data from:
/Users/jmmcd/OneDrive - National University of Irel

15               1               2      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Wren Hornpipe_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          62    0.0      2.00       105         0             0   
1          64    2.0      0.66       105         2             1   
6          74    6.0      0.24        95        10             1   
11         76   10.0      0.66       105         2             1   
16         72   14.0      1.32        95        -4            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               9     -7            5  
1                1               9     -5            7  
6                2               9      5            5  
11               3               9      7            7  
16               2               9      3            3  

Reading data from:
/Users/jm


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Dinny Mescall's Slide_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         69    0.0       1.0       105         0             0   
1         71    1.0       2.0       105         2             1   
3         71    4.0       1.0        95         0             0   
6         74    7.0       1.0        95         3             1   
9         67   10.0       2.0        95        -7            -1   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               7      2            2  
1               1               7      4            4  
3               1               7      4            4  
6               2               7      7            7  
9               1               7      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/

11              -2               7      2            2  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Tenpenny Piece, The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          74    0.0       1.0       105         0             0   
1          79    1.0       1.0       105         5             1   
4          76    4.0       1.0        95        -3            -1   
7          76    7.0       1.0        95         0             0   
10         79   10.0       1.0       105         3             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               4     10           10  
1                1               4     15            3  
4                0               4     12            0  
7                0               4     12            0  
10               1               4     15            3  

Reading data from:
/Us


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Micky O'Callaghan's Favorite (h'pipe)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          78    0.0      1.32       105         0             0   
2          74    2.0      1.32       105        -4            -1   
6          74    6.0      1.32        95         0             0   
10         74   10.0      1.32       105         0             0   
14         74   14.0      1.32        95         0             0   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2     16            4  
2               -1               2     12            0  
6               -1               2     12            0  
10              -1               2     12            0  
14              -1               2     12            0  

Reading data from:
/Users/jmmcd/OneDrive - National Universit

14               0               7      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Mister Henry's Single Jig_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         74    0.0       1.0       105         0             0   
3         69    3.0       3.0       105        -5            -1   
4         69    6.0       1.0        95         0             0   
7         74    9.0       2.0        95         5             1   
9         66   12.0       2.0        95        -8            -1   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               2     12            0  
3              -1               2      7            7  
4              -1               2      7            7  
7               0               2     12            0  
9              -1               2      4            4  

Reading data from:
/Users/jm

15               2               2      5            5  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Kid, The_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         71    0.0       1.0       105         0             0   
2         74    2.0       2.0       105         3             1   
4         71    5.0       1.0        95        -3            -1   
5         71    6.0       1.0        95         0             0   
7         67    8.0       1.0       105        -4            -1   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               7      4            4  
2               1               7      7            7  
4               0               7      4            4  
5               0               7      4            4  
7              -1               7      0            0  

Reading data from:
/Users/jmmcd/OneDrive - Na


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Belles of St. Louis (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          67    0.0       2.0       105         0             0   
3          71    4.0       1.0        95         4             1   
6          71    8.0       1.0       105         0             0   
10         79   12.0       1.0        95         8             1   
14         69   16.0       2.0       105       -10            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      0            0  
3                1               7      4            4  
6                1               7      4            4  
10               2               7     12            0  
14               1               7      2            2  

Reading data from:
/Users/jmmcd/OneDrive - National University of Irelan

14               0               2      9            9  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/I'm Waiting for You (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          62    0.0      0.66       105         0             0   
3          67    2.0      2.00       105         5             1   
6          71    6.0      1.00        95         4             1   
10         69   10.0      1.00       105        -2            -1   
14         69   14.0      1.00        95         0             0   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7     -5            7  
3                1               7      0            0  
6                2               7      4            4  
10               1               7      2            2  
14               1               7      2            2  

Reading data fr

    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0       1.0       105         0             0   
3          62    3.0       1.0        95        -7            -1   
6          74    6.0       1.0       105        12             1   
9          69    9.0       1.0        95        -5            -1   
12         67   12.0       1.0       105        -2            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      2            2  
3               -1               7     -5            7  
6                0               7      7            7  
9               -1               7      2            2  
12              -2               7      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Old Concertina  (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0   

/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Boys Of The Lough (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          74    0.0       1.0       105         0             0   
2          69    2.0       1.0       105        -5            -1   
7          69    6.0       1.0        95         0             0   
11         74   10.0       1.0       105         5             1   
15         76   14.0       1.0        95         2             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2     12            0  
2               -1               2      7            7  
7               -1               2      7            7  
11               0               2     12            0  
15               1               2     14            2  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Sporting Days of Easter (reel), The(1)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          74    0.0       1.0       105         0             0   
4          67    4.0       1.0        95        -7            -1   
8          69    8.0       1.0       105         2             1   
12         69   12.0       2.0        95         0             0   
15         74   16.0       1.0       105         5             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2     12            0  
4               -1               2      5            5  
8                0               2      7            7  
12               0               2      7            7  
15               1               2     12            0  

Reading data from:
/Users/jmmcd/OneDrive - National Universi


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Music Club (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          74    0.0       2.0       105         0             0   
3          74    4.0       2.0        95         0             0   
6          74    8.0       2.0       105         0             0   
9          74   12.0       1.0        95         0             0   
12         74   16.0       1.0       105         0             0   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2     12            0  
3                0               2     12            0  
6                0               2     12            0  
9                0               2     12            0  
12               0               2     12            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, G

    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          74    0.0       1.0       105         0             0   
2          71    2.0       1.0       105        -3            -1   
5          71    6.0       1.0        95         0             0   
9          62   10.0       2.0       105        -9            -1   
11         69   14.0       1.0        95         7             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               4     10           10  
2               -1               4      7            7  
5               -1               4      7            7  
9               -2               4     -2           10  
11              -1               4      5            5  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Padraig O'Keeffe's Slide (2)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0  


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Last Night's Fun_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         64    0.0       2.0       105         0             0   
2         71    3.0       2.0        95         7             1   
4         71    6.0       1.0        95         0             0   
7         64    9.0       2.0       105        -7            -1   
9         71   12.0       1.0        95         7             1   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               4      0            0  
2               1               4      7            7  
4               1               4      7            7  
7               0               4      0            0  
9               1               4      7            7  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_

12               0               7      9            9  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Humors of Ederney (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          64    0.0       1.0       105         0             0   
2          69    2.0       2.0       105         5             1   
5          64    6.0       1.0        95        -5            -1   
9          69   10.0       2.0       105         5             1   
12         81   14.0       1.0        95        12             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               9     -5            7  
2                1               9      0            0  
5                0               9     -5            7  
9                1               9      0            0  
12               2               9     12            0  

Reading data


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/My Love in the Morning_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         67    0.0       1.0       105         0             0   
1         69    1.0       1.0       105         2             1   
4         69    4.0       1.0        95         0             0   
7         72    7.0       2.0       105         3             1   
9         72   10.0       2.0        95         0             0   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               9     -2           10  
1               1               9      0            0  
4               1               9      0            0  
7               2               9      3            3  
9               2               9      3            3  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub

14               0               4      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Girl of the Big House, The_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         64    0.0       1.0       105         0             0   
1         66    1.0       2.0       105         2             1   
3         67    4.0       2.0        95         1             1   
5         69    7.0       2.0       105         2             1   
7         66   10.0       1.0        95        -3            -1   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               2      2            2  
1               1               2      4            4  
3               2               2      5            5  
5               3               2      7            7  
7               2               2      4            4  

Reading data from:
/Users/j

10              -2               2      7            7  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Lynch's (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          79    0.0      1.00       105         0             0   
2          76    2.0      1.00       105        -3            -1   
5          71    6.0      1.00        95        -5            -1   
8          67   10.0      1.00       105        -4            -1   
12         71   14.0      0.66        95         4             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               4     15            3  
2               -1               4     12            0  
5               -2               4      7            7  
8               -3               4      3            3  
12              -2               4      7            7  

Reading data from:
/Users/j


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Gerry Commane's (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          78    0.0       1.0       105         0             0   
2          74    2.0       2.0       105        -4            -1   
5          72    6.0       1.0        95        -2            -1   
9          69   10.0       3.0       105        -3            -1   
11         69   14.0       1.0        95         0             0   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2     16            4  
2               -1               2     12            0  
5               -2               2     10           10  
9               -3               2      7            7  
11              -3               2      7            7  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, G


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Con Curtin's Big Balloon_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          74    0.0       1.0       105         0             0   
1          71    1.0       1.0       105        -3            -1   
4          64    4.0       1.0        95        -7            -1   
7          71    7.0       1.0       105         7             1   
10         64   10.0       1.0        95        -7            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               4     10           10  
1               -1               4      7            7  
4               -2               4      0            0  
7               -1               4      7            7  
10              -2               4      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland,

/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Lands of Scotland, The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          66    0.0       1.0       105         0             0   
3          69    3.0       1.0        95         3             1   
6          66    6.0       1.0        95        -3            -1   
9          69    9.0       2.0       105         3             1   
11         69   12.0       1.0        95         0             0   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7     -1           11  
3                1               7      2            2  
6                0               7     -1           11  
9                1               7      2            2  
11               1               7      2            2  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ng

10               2               9      4            4  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Pleasant to Start_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0       1.0       105         0             0   
3          69    3.0       1.0        95         0             0   
6          71    6.0       1.0       105         2             1   
9          69    9.0       1.0        95        -2            -1   
12         69   12.0       1.0       105         0             0   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      7            7  
3                0               2      7            7  
6                1               2      9            9  
9                0               2      7            7  
12               0               2      7            7  

Reading data from:
/User


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Philip O'Beirne's Delight (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0       1.0       105         0             0   
2          66    2.0       1.0       105        -3            -1   
7          66    6.0       1.0        95         0             0   
12         64   10.0       1.0       105        -2            -1   
16         64   14.0       1.0        95         0             0   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      7            7  
2               -1               2      4            4  
7               -1               2      4            4  
12              -2               2      2            2  
16              -2               2      2            2  

Reading data from:
/Users/jmmcd/OneDrive - National University of 


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Ahascragh Pig, The_accent.csv
   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         74    0.0       1.0       105         0             0   
1         74    1.0       1.0       105         0             0   
4         69    4.0       2.0        95        -5            -1   
6         74    7.0       1.0       105         5             1   
9         66   10.0       1.0        95        -8            -1   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               2     12            0  
1               0               2     12            0  
4              -1               2      7            7  
6               0               2     12            0  
9              -1               2      4            4  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/fol

   MIDI_note  onset  duration  velocity  interval  parsons_code  \
0         67    0.0       1.0       105         0             0   
1         64    1.0       1.0       105        -3            -1   
4         69    4.0       2.0        95         5             1   
6         72    7.0       1.0       105         3             1   
9         72   10.0       1.0        95         0             0   

   Parsons_cumsum  chromatic_root  pitch  pitch_class  
0               0               9     -2           10  
1              -1               9     -5            7  
4               0               9      0            0  
6               1               9      3            3  
9               1               9      3            3  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Her Golden Hair Was Curling Down_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Farewell to the Heather (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          59    0.0       1.0       105         0             0   
1          57    1.0       1.0       105        -2            -1   
5          64    5.0       1.0        95         7             1   
9          69    9.0       1.0       105         5             1   
13         73   13.0       1.0        95         4             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               9    -10            2  
1               -1               9    -12            0  
5                0               9     -5            7  
9                1               9      0            0  
13               2               9      4            4  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ir


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Primrose Lass (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          71    0.0      2.00       105         0             0   
3          67    4.0      1.00        95        -4            -1   
7          62    8.0      1.00       105        -5            -1   
11         71   12.0      1.00        95         9             1   
14         71   16.0      0.66       105         0             0   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      4            4  
3               -1               7      0            0  
7               -2               7     -5            7  
11              -1               7      4            4  
14              -1               7      4            4  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Paddy Fahy's Reel 2_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          67    0.0      1.00       105         0             0   
2          64    2.0      0.66       105        -3            -1   
7          64    6.0      1.00        95         0             0   
11         70   10.0      3.00       105         6             1   
13         72   14.0      1.00        95         2             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               0      7            7  
2               -1               0      4            4  
7               -1               0      4            4  
11               0               0     10           10  
13               1               0     12            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galw

8              -4               2      5            5  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Murphy's (hornpipe)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          69    0.0      0.66       105         0             0   
3          74    2.0      2.00       105         5             1   
6          74    6.0      2.00        95         0             0   
9          74   10.0      1.32       105         0             0   
13         76   14.0      1.32        95         2             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      7            7  
3                1               2     12            0  
6                1               2     12            0  
9                1               2     12            0  
13               2               2     14            2  

Reading data from:
/Use


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Lovely Lassie Winking (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          66    0.0       1.0       105         0             0   
4          71    4.0       1.0        95         5             1   
8          71    8.0       1.0       105         0             0   
12         71   12.0       1.0        95         0             0   
15         66   16.0       1.0       105        -5            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      4            4  
4                1               2      9            9  
8                1               2      9            9  
12               1               2      9            9  
15               0               2      4            4  

Reading data from:
/Users/jmmcd/OneDrive - National University of Irel


Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Girls of the County Mayo (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          76    0.0       1.0       105         0             0   
3          76    4.0       1.0        95         0             0   
7          69    8.0       2.0       105        -7            -1   
10         71   12.0       1.0        95         2             1   
13         76   16.0       1.0       105         5             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      9            9  
3                0               7      9            9  
7               -1               7      2            2  
10               0               7      4            4  
13               1               7      9            9  

Reading data from:
/Users/jmmcd/OneDrive - National University

16               0               4      7            7  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Flowers of Spring, The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          67    0.0       1.0       105         0             0   
1          69    1.0       1.0       105         2             1   
4          69    4.0       1.0        95         0             0   
7          76    7.0       1.0       105         7             1   
10         67   10.0       2.0        95        -9            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               9     -2           10  
1                1               9      0            0  
4                1               9      0            0  
7                2               9      7            7  
10               1               9     -2           10  

Reading data from:


14               0               7      0            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Tap the Barrel (reel)_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          62    0.0       2.0       105         0             0   
4          69    4.0       1.0        95         7             1   
9          72    8.0       1.0       105         3             1   
13         60   12.0       1.0        95       -12            -1   
17         62   16.0       2.0       105         2             1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               2      0            0  
4                1               2      7            7  
9                2               2     10           10  
13               1               2     -2           10  
17               2               2      0            0  

Reading data from:
/

13               1               2     12            0  

Reading data from:
/Users/jmmcd/OneDrive - National University of Ireland, Galway/GitHub/folk_ngram_analysis/corpus/feat_seq_data/accent/Master's Sporting Paddy (reel), The_accent.csv
    MIDI_note  onset  duration  velocity  interval  parsons_code  \
0          67    0.0       1.0       105         0             0   
2          71    2.0       2.0       105         4             1   
5          62    6.0       1.0        95        -9            -1   
9          71   10.0       2.0       105         9             1   
12         67   14.0       1.0        95        -4            -1   

    Parsons_cumsum  chromatic_root  pitch  pitch_class  
0                0               7      0            0  
2                1               7      4            4  
5                0               7     -5            7  
9                1               7      4            4  
12               0               7      0            0  

Readin

Now we have frequency counts and TF-IDF scores for each $n$-gram in each tune. For example, the frequency table looks like this:

In [9]:
df = pd.read_csv(basepath + "/ngrams/cre_pitch_class_accents_ngrams_freq.csv")
df.head()


Unnamed: 0.1,Unnamed: 0,ngram,"Primrose Girl (reel), The_accent_pitch_class_freq","Soldier's Joy, The_accent_pitch_class_freq","Sligo Jig, The_accent_pitch_class_freq","Rambling Connachtman (reel), The_accent_pitch_class_freq",Orange and Green_accent_pitch_class_freq,"London Lasses (reel), The_accent_pitch_class_freq",Dusty Miller_accent_pitch_class_freq,"Rainy Day Jig, The_accent_pitch_class_freq",...,Tear the Calico (reel)_accent_pitch_class_freq,Jack's Alive (reel)_accent_pitch_class_freq,"Tirnaskea Lasses (reel), The_accent_pitch_class_freq",Roaring Mary (reel)_accent_pitch_class_freq,"Master's Sporting Paddy (reel), The_accent_pitch_class_freq","Chattering Magpie (reel), The_accent_pitch_class_freq",Scully Casey's ( ) (hornpipe)_accent_pitch_class_freq,"Ranger (h'pipe), The_accent_pitch_class_freq",freq,idf
0,0,"(0, 0, 0, 0, 0)",0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,595,6.225
1,1,"(0, 0, 0, 0, 0, 0)",0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,494,6.41
2,2,"(0, 0, 0, 0, 0, 0, 0)",0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,456,6.49
3,3,"(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)",0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,435,6.537
4,4,"(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)",0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,425,6.561


**TODO** Explain a little of the above, or else print out something else to show what we have got from the ngrams.

In [10]:
df[["ngram", "A Trip To Galway_accent_pitch_class_freq", "freq", "idf"]]

| | ngram | A Trip To Galway_accent_pitch_class_freq | freq | idf |
|---|---|---|---|---|
| 0 | (0, 0, 0, 0, 0) | 0 | 595 | 6.225 |
| 1 | (0, 0, 0, 0, 0, 0) | 0 | 494 | 6.410 |
| 2 | (0, 0, 0, 0, 0, 0, 0) | 0 | 456 | 6.490 |
| 3 | (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) | 0 | 435 | 6.537 |
| 4 | (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0) | 0 | 425 | 6.561 |
| ... | ... | ... | ... | ... |
| 495 | (7, 7, 0, 0, 4) | 0 | 19 | 9.667 |
| 496 | (10, 10, 0, 2, 10) | 0 | 19 | 9.667 |
| 497 | (7, 10, 7, 10, 7, 10) | 0 | 19 | 9.667 |
| 498 | (0, 5, 0, 0, 4) | 0 | 19 | 9.667 |
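
To make the roles of per-tune counts and `idf` more concrete, here is a minimal sketch of one common TF-IDF formulation applied to a toy count table. The tune names, the counts, and the formula $\mathrm{idf} = \ln(N / \mathrm{df})$ are illustrative assumptions; the FONN tools may weight $n$-grams differently.

```python
import math

# Hypothetical toy data: n-gram -> {tune title: number of occurrences}.
counts = {
    (0, 7, 0): {"Tune A": 2, "Tune B": 1, "Tune C": 0},
    (7, 4, 2): {"Tune A": 0, "Tune B": 3, "Tune C": 0},
}


def tfidf(counts):
    """Return n-gram -> {tune: tf * idf}, with idf = ln(N / df).

    N is the number of tunes and df is the number of tunes in which
    the n-gram occurs at least once (an assumed, textbook definition).
    """
    scores = {}
    for ngram, per_tune in counts.items():
        n_tunes = len(per_tune)
        df = sum(1 for c in per_tune.values() if c > 0)
        idf = math.log(n_tunes / df) if df else 0.0
        scores[ngram] = {tune: c * idf for tune, c in per_tune.items()}
    return scores


scores = tfidf(counts)
for ngram, per_tune in scores.items():
    print(ngram, {tune: round(s, 3) for tune, s in per_tune.items()})
```

Under this formulation, an $n$-gram that occurs in few tunes gets a high `idf`, so it contributes more to a tune's profile than an $n$-gram that occurs almost everywhere.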


Finally, we demonstrate some work-in-progress for calculating similarity between tunes, based on the similarity between their $n$-grams. This uses the Damerau-Levenshtein edit distance.
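
As a rough illustration of the distance itself, the sketch below implements the restricted variant (optimal string alignment) in pure Python and applies it to pitch-class $n$-grams. The function name `osa_distance` is hypothetical, and the *fastDamerauLevenshtein* library used in this notebook may differ in details such as weights or the exact variant.

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts the minimum number of insertions, deletions, substitutions,
    and swaps of adjacent items needed to turn sequence a into b.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Swap of two adjacent items.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


query = (2, 7, 2, 7, 4, 7)
print(osa_distance(query, (2, 7, 2, 7, 2, 7)))  # one substitution
print(osa_distance(query, (7, 2, 7, 4, 7)))     # one deletion
print(osa_distance((2, 7, 4), (2, 4, 7)))       # one adjacent swap
```

With a threshold of 1, as in `edit_dist_threshold=1` below, two $n$-grams count as similar when they differ by at most one such edit.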

In [11]:
from ngram_pattern_search import NgramSimilarity

In [12]:
pattern_search = NgramSimilarity(ngram_sim_inpath)
pattern_search.extract_candidate_ngrams("Lord McDonald's (reel)", n=6, mode='idx', indices=[0, 1])
pattern_search.setup_test_corpus()
pattern_search.find_similar_patterns(edit_dist_threshold=1)
pattern_search.find_similar_tunes()
print(pattern_search.results)

                            ngram  \
0  (2.0, 7.0, 2.0, 7.0, 4.0, 7.0)   
1  (4.0, 4.0, 7.0, 7.0, 2.0, 4.0)   
2  (7.0, 2.0, 4.0, 0.0, 7.0, 4.0)   
3  (7.0, 4.0, 7.0, 2.0, 7.0, 2.0)   
4  (7.0, 2.0, 4.0, 7.0, 2.0, 4.0)   

   Primrose Girl (reel), The_accent_pitch_class_tfidf  \
0                                                0.0    
1                                                0.0    
2                                                0.0    
3                                                0.0    
4                                                0.0    

   Soldier's Joy, The_accent_pitch_class_tfidf  \
0                                          0.0   
1                                          0.0   
2                                          0.0   
3                                          0.0   
4                                          0.0   

   Sligo Jig, The_accent_pitch_class_tfidf  \
0                                      0.0   
1                                      0.

Corpus n-gram filtering complete.

Searching corpus for similar n-gram patterns...
57 Similar patterns detected:
                            ngram  (2.0, 7.0, 2.0, 7.0, 4.0, 7.0)  \
0  (2.0, 7.0, 2.0, 7.0, 2.0, 7.0)                             1.0   
1       (4.0, 4.0, 7.0, 7.0, 4.0)                             4.0   
2       (7.0, 2.0, 7.0, 4.0, 7.0)                             1.0   
3  (2.0, 7.0, 4.0, 7.0, 4.0, 7.0)                             1.0   
4       (2.0, 7.0, 2.0, 7.0, 4.0)                             1.0   

   (4.0, 4.0, 7.0, 7.0, 2.0, 4.0)  
0                             4.0  
1                             1.0  
2                             4.0  
3                             4.0  
4                             3.0  
Searching corpus for similar tunes...
Similarity results for Lord McDonald's (reel):
                                           title  count
33      Lord McDonald's (reel)_accent_pitch_clas     15
41       Tim Mulloney's (reel)_accent_pitch_clas      5
53 

**TODO** explain something about the above results, or show some more interesting results.