#  Cadences in a Corpus



## A. Import Intervals and Other Code

* The first step is to import all the code required for the Notebook.
* Click **`arrow/run`** or press **`Shift + Enter`** in the following cell:

In [8]:
import intervals
from intervals import * 
from intervals import main_objs
import intervals.visualizations as viz
import pandas as pd
import re
import altair as alt 
from ipywidgets import interact
from pandas import json_normalize
from pyvis.network import Network
from IPython.display import display
import requests
import os
import glob

MYDIR = "saved_csv"
MUSDIR = "Music_Files"

# Create each working folder if it doesn't exist yet.
for folder in (MYDIR, MUSDIR):
    if os.path.isdir(folder):
        print(folder, "folder already exists.")
    else:
        os.makedirs(folder)
        print("created folder : ", folder)

saved_csv folder already exists.
Music_Files folder already exists.
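
The folder check above can also be wrapped in a small helper. The following is a minimal sketch and not part of the original notebook: `ensure_folder` is a hypothetical name, and it relies on the standard library's `os.makedirs(..., exist_ok=True)`, which does nothing when the folder is already present.

```python
import os

def ensure_folder(path):
    """Create `path` if needed; return True when it was newly created."""
    existed = os.path.isdir(path)
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return not existed

# Same two folders and the same messages as the cell above.
for folder in ("saved_csv", "Music_Files"):
    if ensure_folder(folder):
        print("created folder : ", folder)
    else:
        print(folder, "folder already exists.")
```

Returning a flag keeps the printed messages optional, so the helper can be reused silently elsewhere in the Notebook.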


## B. Importing Corpus

* The **CorpusBase** class is a convenient way to find patterns in any given list of pieces.
* The pieces are provided as a **list**, within square brackets and separated by commas.  
* The bracketed list is then contained within the parentheses of `CorpusBase()`
* For example: `corpus = CorpusBase(
       ['https://crimproject.org/mei/CRIM_Mass_0006_1.mei',
       'https://crimproject.org/mei/CRIM_Mass_0006_2.mei',
       'https://crimproject.org/mei/CRIM_Mass_0006_3.mei'])`
* Read the documentation:  `print(CorpusBase.batch.__doc__)`


In [2]:
# This pulls all pieces from the CRIM repository on GitHub, minus the exclusions noted below.
# Note that we exclude various monophonic pieces (which have no contrapuntal cadences)
# and also a few pieces that seem to throw errors for reasons we don't understand.
piece_list = []
raw_prefix = "https://raw.githubusercontent.com/CRIM-Project/CRIM-online/master/crim/static/mei/MEI_4.0/"
URL = "https://api.github.com/repos/CRIM-Project/CRIM-online/git/trees/990f5eb3ff1e9623711514d6609da4076257816c"
piece_json = requests.get(URL).json()
mono_list = ['CRIM_Model_0003.mei', 'CRIM_Model_0004.mei', 'CRIM_Model_0005.mei', 'CRIM_Model_0006.mei', 'CRIM_Model_0007.mei',
            'CRIM_Model_0022.mei', 'CRIM_Model_0028.mei', 'CRIM_Model_0035.mei', 'CRIM_Mass_0029_4.mei', 'CRIM_Mass_0049_2.mei',
            'CRIM_Mass_0049_5.mei']
# Skip whole-Mass files (CRIM_Mass_NNNN.mei, with no movement suffix) and anything in mono_list.
pattern = r'CRIM_Mass_([0-9]{4})\.mei'
for p in piece_json["tree"]:
    p_name = p["path"]
    if re.search(pattern, p_name) or p_name in mono_list:
        continue
    piece_list.append(raw_prefix + p_name)
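
The filtering step in the cell above can be tried offline, without calling the GitHub API. This is a minimal sketch under stated assumptions: `select_pieces` is a hypothetical helper, and `sample_paths` is a made-up sample of file names used only to show which entries the pattern keeps and which it skips.

```python
import re

# Whole-Mass files look like CRIM_Mass_0006.mei; movements add a suffix such as _1.
WHOLE_MASS = re.compile(r"CRIM_Mass_[0-9]{4}\.mei")

def select_pieces(paths, excluded, prefix=""):
    """Return prefixed paths, skipping whole-Mass files and excluded names."""
    return [
        prefix + path
        for path in paths
        if not WHOLE_MASS.search(path) and path not in excluded
    ]

# Hypothetical sample of repository paths, for illustration only.
sample_paths = [
    "CRIM_Mass_0006.mei",
    "CRIM_Mass_0006_1.mei",
    "CRIM_Model_0003.mei",
    "CRIM_Model_0008.mei",
]
selected = select_pieces(sample_paths, excluded={"CRIM_Model_0003.mei"})
print(selected)
```

With this sample, only the Mass movement and the model that is not on the exclusion list remain.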

In [2]:
# Build your own list
piece_list = ['https://crimproject.org/mei/CRIM_Mass_0006_1.mei',
 'https://crimproject.org/mei/CRIM_Mass_0006_2.mei',
 'https://crimproject.org/mei/CRIM_Mass_0006_3.mei']

In [9]:
# use this to make a list of all the pieces in the Music_Files folder

piece_list = []
for name in glob.glob('Music_Files/*'):
    piece_list.append(name)
piece_list

[]
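
An empty list, as in the output above, means the glob found nothing in `Music_Files`. When the folder does hold files, it may also contain items that are not scores. The sketch below is one way to narrow the listing by extension; `list_scores` is a hypothetical helper, and the choice of `.mei` as the default extension is an assumption based on the MEI files used elsewhere in this Notebook.

```python
import glob
import os
import tempfile

def list_scores(folder, extension=".mei"):
    """Return the sorted paths in `folder` whose names end with `extension`."""
    return sorted(glob.glob(os.path.join(folder, "*" + extension)))

# Demonstration in a throwaway folder with invented file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b_piece.mei", "a_piece.mei", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = [os.path.basename(p) for p in list_scores(tmp)]

print(found)
```

In the Notebook itself the call would be `piece_list = list_scores('Music_Files')`.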

In [3]:
corpus = CorpusBase(piece_list)

Downloading remote score...
Successfully imported https://crimproject.org/mei/CRIM_Mass_0006_1.mei
Downloading remote score...
Successfully imported https://crimproject.org/mei/CRIM_Mass_0006_2.mei
Downloading remote score...
Successfully imported https://crimproject.org/mei/CRIM_Mass_0006_3.mei


In [4]:
corpus.batch(func=ImportedPiece.cadences, verbose=True)


Running cadences analysis on 3 pieces:
	1: Missa Je suis déshéritée: Kyrie
	2: Missa Je suis déshéritée: Gloria
	3: Missa Je suis déshéritée: Credo


[                      CadType  LeadingTones  CVFs Tone RelTone  TSig  Measure  \
 60.0                Authentic           1.0  CuTB    D      P8   4/2        8   
 92.0            Clausula Vera           0.0    TC    D      P8   4/2       12   
 108.0               Authentic           1.0   CTB    D      P8  10/2       14   
 140.0           Clausula Vera           1.0    TC    F      m3   4/2       17   
 148.0               Authentic           1.0    CB    F      m3   4/2       18   
 160.0               Authentic           1.0   TBC    F      m3   4/2       19   
 172.0           Clausula Vera           1.0    CT    F      m3   4/2       21   
 180.0                     NaN           NaN    Cx    F      m3   4/2       22   
 188.0        Evaded Authentic           1.0   CTb    F      m3   4/2       23   
 204.0        Evaded Authentic           1.0   CTb    C      m7   4/2       25   
 212.0           Clausula Vera           0.0    TC    C      m7   4/2       26   
 216.0          

In [11]:
corpus = CorpusBase([
                     'https://crimproject.org/mei/CRIM_Mass_0029_4.mei'])

Downloading remote score...
Successfully imported https://crimproject.org/mei/CRIM_Mass_0029_4.mei


### C.1. Find the Cadences in the Corpus

* Sample code (remember to omit the `()` after `cadences`, because the function itself is passed to `batch`):
* `func = ImportedPiece.cadences
list_of_dfs = corpus.batch(func=func, metadata=True)
combined_df = pd.concat(list_of_dfs, ignore_index=False)`
* Suggested reorganization of columns in the output:
* `col_list = ['Composer', 'Title', 'Measure', 'Beat', 'CadType', 'Tone','CVFs', 'LeadingTones', 'Low','RelLow','RelTone','Progress','SinceLast','ToNext', 'Validation', 'Comments']`
* `combined_df = combined_df[col_list]`

In [None]:
func = ImportedPiece.cadences
list_of_dfs = corpus.batch(func=func, metadata=True)
combined_df = pd.concat(list_of_dfs, ignore_index=False)


combined_df['Validation'] = ""
combined_df['Comments'] = ""
col_list = ['Composer', 'Title', 'Measure', 'Beat', 'CadType', 'Tone','CVFs',
                'LeadingTones', 'Low','RelLow','RelTone',
                'Progress','SinceLast','ToNext', 'Validation', 'Comments']
combined_df = combined_df[col_list]
combined_df
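
Selecting `combined_df[col_list]` raises a `KeyError` if any name in the list is missing from the results. The sketch below shows the same concatenate-and-reorder steps on toy data, using pandas' `DataFrame.reindex(columns=...)` as a more forgiving alternative. The two small frames and their values are invented stand-ins for the per-piece results, not real output from `corpus.batch`.

```python
import pandas as pd

# Toy stand-ins for the per-piece DataFrames that corpus.batch() returns.
# The values are invented for illustration; they are not real analysis results.
piece_a = pd.DataFrame(
    {"Title": ["Piece A", "Piece A"], "Measure": [8, 12],
     "CadType": ["Authentic", "Clausula Vera"], "Tone": ["D", "D"]},
    index=[60.0, 92.0],
)
piece_b = pd.DataFrame(
    {"Title": ["Piece B"], "Measure": [5], "CadType": ["Phrygian"], "Tone": ["A"]},
    index=[36.0],
)

# Keep each piece's own offsets as the index, as in the cell above.
combined = pd.concat([piece_a, piece_b], ignore_index=False)

# Blank columns for manual review.
combined["Validation"] = ""
combined["Comments"] = ""

# reindex() orders the columns and, unlike combined[wanted], does not raise
# a KeyError for a missing name: "Beat" is created and filled with NaN.
wanted = ["Title", "Measure", "Beat", "CadType", "Tone", "Validation", "Comments"]
combined = combined.reindex(columns=wanted)
print(combined)
```

The trade-off is that a misspelled column name passes silently as an empty column, so it is worth checking the result for columns that are entirely `NaN`.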

### C.5. Summary by Type, Tone, etc.

* Here you can report an inventory of cadences by **type** and **tone** (and **evaded** status):
* `combined_df['Tone'].value_counts().to_frame()`




In [30]:
combined_df['CadType'].value_counts().to_frame()


| | CadType |
|---|---|
| Clausula Vera | 2122 |
| Authentic | 1628 |
| Phrygian Clausula Vera | 495 |
| Evaded Authentic | 493 |
| Evaded Clausula Vera | 464 |
| Altizans Only | 141 |
| Evaded Altizans Only | 125 |
| Phrygian | 63 |
| Phrygian Altizans | 6 |
| Double Leading Tone | 2 |


* Or, various groupings:
* `combined_df.groupby(['Tone', 'CadType']).size().reset_index(name='counts')`

In [31]:

grouped_types = combined_df.groupby(['Tone', 'CadType']).size().reset_index(name='counts')
# Display the table (to save it instead, call grouped_types.to_csv() with a file path).
grouped_types

| | Tone | CadType | counts |
|---|---|---|---|
| 0 | A | Altizans Only | 21 |
| 1 | A | Authentic | 154 |
| 2 | A | Clausula Vera | 141 |
| 3 | A | Evaded Authentic | 23 |
| 4 | A | Evaded Clausula Vera | 13 |
| 5 | A | Phrygian | 27 |
| 6 | A | Phrygian Altizans | 2 |
| 7 | A | Phrygian Clausula Vera | 237 |
| 8 | B | Phrygian | 7 |
| 9 | B | Phrygian Altizans | 1 |
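
The grouped table above is in long form, with one row per tone and cadence type. For a side-by-side comparison, the same counts can be laid out as a grid. This is a minimal sketch using pandas' `crosstab`; the `toy` frame is an invented stand-in for `combined_df` and does not contain real corpus counts.

```python
import pandas as pd

# Invented rows standing in for combined_df; not real corpus counts.
toy = pd.DataFrame({
    "Tone":    ["A", "A", "A", "D", "D", "G"],
    "CadType": ["Authentic", "Authentic", "Phrygian",
                "Authentic", "Clausula Vera", "Clausula Vera"],
})

# Long form: one row per (Tone, CadType) pair, as in the cell above.
long_counts = toy.groupby(["Tone", "CadType"]).size().reset_index(name="counts")

# Wide form: tones as rows, cadence types as columns, zeros where a pair is absent.
wide_counts = pd.crosstab(toy["Tone"], toy["CadType"])
print(wide_counts)
```

On the real data the equivalent call would be `pd.crosstab(combined_df['Tone'], combined_df['CadType'])`.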


In [32]:
combined_df.to_csv("CRIM_Corpus_Cadences.csv")
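
The cell above writes the file to the current working directory. If the `saved_csv` folder created in section A is the intended destination, the path can be built with `os.path.join`. The following is a sketch under that assumption; the one-row `toy` frame is an invented stand-in for `combined_df`.

```python
import os
import pandas as pd

MYDIR = "saved_csv"                # folder name used in section A
os.makedirs(MYDIR, exist_ok=True)  # harmless if the folder is already there

# Invented one-row frame standing in for combined_df.
toy = pd.DataFrame({"CadType": ["Authentic"], "Tone": ["D"]}, index=[60.0])

out_path = os.path.join(MYDIR, "CRIM_Corpus_Cadences.csv")
toy.to_csv(out_path)               # the index (offsets) is written as the first column

# Read the file back, restoring the first column as the index.
restored = pd.read_csv(out_path, index_col=0)
print(restored)
```

Passing `index_col=0` on the way back in avoids the extra `Unnamed: 0` column that otherwise appears when a saved index is read as ordinary data.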