# A Very Brief Introduction to Jupyter Notebooks

### Introduction

This **Jupyter Notebook** brings together various tools for the analysis of symbolic music scores used in **The CRIM Project** (https://crimproject.org). It relies on **Pandas**, a popular Python package that makes the manipulation of tabular information (in things called **data frames**) fast and relatively easy. <br>

Some of these tools are meant to give insights into **one work at a time**.  Others are meant to help us explore **sets of pieces**, or even **an entire corpus**.  There are tools for exploring pitches, durations, melodic and harmonic patterns, and contrapuntal types (like cadences and points of imitation), as well as tools (like heat maps and networks) that can help us visualize activity in a piece or relationships among several pieces.<br>

These Notebooks are available via the **CRIM Jupyter Hub**, hosted by **Haverford College**:  **https://ds-crim.haverford.edu/**.  Contact Richard Freedman for login and password.


### Run the Notebook ####

* **Jupyter Notebooks** allow anyone to run **Python** code in any browser.  And Haverford's **Jupyter Hub** allows you to do so over the internet, without the need to install special software on your own computer.

* **Jupyter Notebooks** are organized as 'cells', which can be **commentary** (like this one, which is static), or **code** (those below, which produce dynamic output in the form of charts or tabular data frames).  

* To run an individual cell, use the **`arrow/run`** command at the top of the Notebook, or just press **`Shift + Enter`** on your keyboard.
* Use the practice cells below to try out some basic functions.

### Tutorial and Documentation

Learn more about how to use CRIM Intervals here:  https://github.com/HCDigitalScholarship/intervals/blob/rich_dev_22/tutorial/01_Introduction.md


## A. Import Intervals and Other Code

* The first step is to import all the code required for the Notebook.
* Use **`arrow/run`** or **`Shift + Enter`** to run the following cell:

In [1]:
import intervals
from intervals import * 
from intervals import main_objs
import intervals.visualizations as viz
import pandas as pd
import re
import altair as alt
import matplotlib.pyplot as plt
import seaborn as sns
from ipywidgets import interact
from pandas import json_normalize
from pyvis.network import Network
from IPython.display import display
import requests
import os

MYDIR = ("saved_csv")
CHECK_FOLDER = os.path.isdir(MYDIR)

# If folder doesn't exist, then create it.
if not CHECK_FOLDER:
    os.makedirs(MYDIR)
    print("created folder : ", MYDIR)
else:
    print(MYDIR, "folder already exists.")
    
MUSDIR = ("Music_Files")
CHECK_FOLDER = os.path.isdir(MUSDIR)

# If folder doesn't exist, then create it.
if not CHECK_FOLDER:
    os.makedirs(MUSDIR)
    print("created folder : ", MUSDIR)

else:
    print(MUSDIR, "folder already exists.")

saved_csv folder already exists.
Music_Files folder already exists.
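The folder check in the cell above can be written more compactly. The following is a minimal sketch, not part of CRIM Intervals: `ensure_folder` is a hypothetical helper name, and the demonstration runs inside a temporary directory so that it leaves nothing behind.

```python
import os
import tempfile


def ensure_folder(path: str) -> bool:
    """Create `path` if it is missing; return True only if it was newly created.

    Hypothetical helper for illustration. `exist_ok=True` tells os.makedirs
    not to raise an error when the folder is already there.
    """
    already_there = os.path.isdir(path)
    os.makedirs(path, exist_ok=True)
    return not already_there


# Try it in a throwaway directory rather than in the notebook's own folder.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "saved_csv")
    first_call = ensure_folder(target)   # folder did not exist yet
    second_call = ensure_folder(target)  # folder exists now

print("created on first call:", first_call)
print("created on second call:", second_call)
```

In the notebook itself, one would call the helper once for each folder name (`"saved_csv"` and `"Music_Files"`).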


## B. Importing Pieces

### B.1 Import a Single Piece and Check Metadata for Title and Composer

- Here you will want to select the appropriate 'prefix' that identifies the location of your file.
- `'Music_Files/'` is for files in the local notebook; `'https://crimproject.org/mei/'` is for the files on CRIM.
- Then provide the full name (and extension) of your music file, such as `'CRIM_Model_0038.mei'`

In [2]:
# Select a prefix:
# prefix = 'Music_Files/'
prefix = 'https://crimproject.org/mei/' 


# Add your filename here
mei_file = 'CRIM_Model_0032.mei'

# These join the strings and import the piece
url = prefix + mei_file
piece = importScore(url)

print(piece.metadata)

{'title': 'Sancta et immaculata virginitas', 'composer': 'Cristóbal de Morales', 'date': 1546}
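The location passed to `importScore` is simply the prefix and the file name joined as one string. As a small sketch (the helper name `build_location` is an assumption made for this illustration, not a CRIM Intervals function), the join can be guarded against a forgotten trailing slash:

```python
def build_location(prefix: str, filename: str) -> str:
    """Join a folder or URL prefix and a file name into one location string.

    Hypothetical helper: it only adds the '/' separator when the prefix
    does not already end with one.
    """
    if not prefix.endswith("/"):
        prefix += "/"
    return prefix + filename


remote = build_location("https://crimproject.org/mei", "CRIM_Model_0032.mei")
local = build_location("Music_Files/", "CRIM_Model_0032.mei")
print(remote)
print(local)
```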


## C. All About Notes and Rests

Learn more here: https://github.com/HCDigitalScholarship/intervals/blob/rich_dev_22/tutorial/02_NotesAndRests.md

In [3]:
piece.notes()  

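| | Superius | Altus | Tenor | Bassus |
|---|---|---|---|---|
| 0.0 | Rest | Rest | A3 | Rest |
| 8.0 | | | D3 | |
| 12.0 | | | A3 | |
| 16.0 | | | | D3 |
| 18.0 | | | A3 | |
| ... | ... | ... | ... | ... |
| 1118.0 | G4 | | | |
| 1120.0 | | | A3 | D3 |
| 1122.0 | F#4 | | | |
| 1123.0 | E4 | | | |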


### C.3  Count, Sort and Graph Notes

* The Pandas library includes a vast array of standard methods for working with data frames (renaming columns, sorting data, counting categories, etc).  You can read just a few of the basic ones here:  **https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf**

* Using our dataframe of notes+rests (**`nr`**), you can experiment with a few (try them out below):

    * **count the non-empty cells in each column** (use `len(nr)` if you simply want the number of rows in the dataframe):  
>`nr.count()`

    * **rename a column**:  
>`nr.rename(columns = {'Superius':'Cantus'})`

    * **stack all the columns** on top of each other to get one list of all the notes:  
>`nr.stack()`

    * **stack and count the number of unique values** (which will tell us how many different tones are in this piece):
>`nr.stack().nunique()`

    * **count the number of each note in each part**:  
>`nr.apply(pd.Series.value_counts).fillna(0).astype(int)`

    * **count and sort** the number of notes in a single voice part: 
    
>`nr.apply(pd.Series.value_counts).fillna(0).astype(int).sort_values(by=nr.columns[0], ascending=False)`
    
* This sorts by the first voice in the score.  If you want to sort by the last, then use `by=nr.columns[-1]` in the request.
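The Pandas methods listed above can be tried on a tiny stand-in table before running them on a real piece. This is only a sketch: the `toy` frame and its values are invented for illustration and do not come from any CRIM score.

```python
import pandas as pd

# A made-up two-voice table shaped like the output of piece.notes():
# the index holds offsets, each column is a voice, None marks "nothing new here".
toy = pd.DataFrame(
    {
        "Superius": ["Rest", "G4", "A4", "G4"],
        "Tenor": ["A3", "D3", None, "A3"],
    },
    index=[0.0, 8.0, 12.0, 16.0],
)

n_rows = len(toy)                                   # number of rows
per_column = toy.count()                            # non-empty cells in each column
renamed = toy.rename(columns={"Superius": "Cantus"})
distinct = toy.stack().nunique()                    # different values across all voices

# How often each value occurs in each voice, with 0 where a voice never has it.
counts = toy.apply(pd.Series.value_counts).fillna(0).astype(int)

# Sort so that the most frequent values of the first voice come first.
sorted_counts = counts.sort_values(by=toy.columns[0], ascending=False)
print(sorted_counts)
```

Note that `'Rest'` is counted as a value like any other, so it would need to be filtered out if only sounding pitches are of interest.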



In [4]:
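# px is used for the bar chart at the end of this cell, so import it explicitly
import plotly.express as px
# set parameters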
combineUnisons = True
combineRests = True
# new pitch table
pitch_order = ['E-2', 'E2', 'F2', 'F#2', 'G2', 'A2', 'B-2', 'B2', 
               'C3', 'C#3', 'D3', 'E-3','E3', 'F3', 'F#3', 'G3', 'G#3','A3', 'B-3','B3',
               'C4', 'C#4','D4', 'E-4', 'E4', 'F4', 'F#4','G4', 'A4', 'B-4', 'B4',
               'C5', 'C#5','D5', 'E-5','E5', 'F5', 'F#5', 'G5', 'A5', 'B-5', 'B5']

# pass the variables set above (not their names as strings)
nr = piece.notes(combineUnisons=combineUnisons, combineRests=combineRests).fillna('-')  
nr = nr.apply(pd.Series.value_counts).fillna(0).astype(int).reset_index().copy()  
nr.rename(columns = {'index':'pitch'}, inplace = True)  
nr['pitch'] = pd.Categorical(nr["pitch"], categories=pitch_order)  
nr = nr.sort_values(by = "pitch").dropna().copy()  
voices = nr.columns.to_list() 
display(nr)
px.bar(nr, x="pitch", y=voices, title="Distribution of Pitches in " + 
       piece.metadata['composer'] + ": " + piece.metadata['title'])

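| | pitch | Superius | Altus | Tenor | Bassus |
|---|---|---|---|---|---|
| 24 | G2 | 0 | 0 | 0 | 15 |
| 1 | A2 | 0 | 0 | 0 | 12 |
| 4 | B-2 | 0 | 0 | 0 | 24 |
| 9 | C3 | 0 | 0 | 0 | 29 |
| 12 | D3 | 0 | 9 | 14 | 62 |
| 15 | E-3 | 0 | 2 | 1 | 23 |
| 18 | E3 | 0 | 1 | 12 | 10 |
| 22 | F3 | 0 | 6 | 37 | 30 |
| 20 | F#3 | 0 | 2 | 2 | 1 |
| 25 | G3 | 0 | 41 | 71 | 32 |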
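The cell above sorts pitches from low to high rather than alphabetically by turning the `pitch` column into a Pandas categorical with an explicit order. Here is a reduced sketch of that one step, using a shortened `pitch_order` and invented counts:

```python
import pandas as pd

# Shortened, low-to-high pitch list (an assumption for this example only).
pitch_order = ["D3", "A3", "G4", "A4"]

# Invented counts; 'Rest' is deliberately missing from pitch_order.
counts = pd.DataFrame(
    {"pitch": ["A3", "A4", "D3", "G4", "Rest"], "Tenor": [2, 0, 1, 0, 3]}
)

# Values that are not in `categories` become missing (NaN) ...
counts["pitch"] = pd.Categorical(counts["pitch"], categories=pitch_order)

# ... so sorting follows pitch_order, and dropna() removes the 'Rest' row.
ordered = counts.sort_values(by="pitch").dropna()
print(ordered)
```

This is why `'-'` and `'Rest'` disappear from the final table in the cell above: neither appears in `pitch_order`.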


### C.3 Durations and Time Signatures
* We can use **`piece.durations()`** to tell us more about rhythms, and then combine the two dataframes into a synoptic view of the pitches and durations of the given piece.  Again, it is helpful to define this request as a variable that we can use later: **`dur = piece.durations().fillna('-')`**
<br>
* And of course we could **apply any of the tools noted above**, counting, sorting, etc, as needed.

* **Time Signatures**: `piece.timeSignatures()` displays each change of time signature.  
* To see the **measure/beat index**, pass this to `piece.detailIndex()`:

>`ts = piece.timeSignatures()
piece.detailIndex(ts)`

In [5]:
ts = piece.timeSignatures()
piece.detailIndex(ts)

| Measure | Beat | Superius | Altus | Tenor | Bassus |
|---|---|---|---|---|---|
| 1 | 1.0 | 4/2 | 4/2 | 4/2 | 4/2 |


In [6]:
dur = piece.durations().fillna('-')
dur

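| | Superius | Altus | Tenor | Bassus |
|---|---|---|---|---|
| 0.0 | 48.0 | 32.0 | 8.0 | 16.0 |
| 8.0 | - | - | 4.0 | - |
| 12.0 | - | - | 6.0 | - |
| 16.0 | - | - | - | 8.0 |
| 18.0 | - | - | 2.0 | - |
| ... | ... | ... | ... | ... |
| 1118.0 | 4.0 | - | - | - |
| 1120.0 | - | - | 20.0 | 20.0 |
| 1122.0 | 1.0 | - | - | - |
| 1123.0 | 1.0 | - | - | - |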


### C.4 Combining Notes and Durations in a Single Data Frame

* Two or more data frames can be combined into one. Here we can combine **`nr`** (our Notes and Rests) with **`dur`** to make a single data frame.  This frame can itself be given a new name:  

>`combined_notes_durs = pd.concat([nr, dur], axis=1)`


In [7]:
nr = piece.notes()
dur = piece.durations()
combined_notes_durs = pd.concat([nr, dur], axis=1).fillna('-')
combined_notes_durs.head()

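| | Superius | Altus | Tenor | Bassus | Superius.1 | Altus.1 | Tenor.1 | Bassus.1 |
|---|---|---|---|---|---|---|---|---|
| 0.0 | Rest | Rest | A3 | Rest | 48.0 | 32.0 | 8.0 | 16.0 |
| 8.0 | - | - | D3 | - | - | - | 4.0 | - |
| 12.0 | - | - | A3 | - | - | - | 6.0 | - |
| 16.0 | - | - | - | D3 | - | - | - | 8.0 |
| 18.0 | - | - | A3 | - | - | - | 2.0 | - |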
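Because both frames use the same voice names, the combined frame ends up with repeated column names. The sketch below uses two invented one-voice frames to show this. It also shows one possible way (an optional variation, not what the notebook does) to keep the two halves apart with the `keys` argument of `pd.concat`:

```python
import pandas as pd

# Invented stand-ins for piece.notes() and piece.durations().
notes = pd.DataFrame({"Tenor": ["A3", "D3"]}, index=[0.0, 8.0])
durs = pd.DataFrame({"Tenor": [8.0, 4.0]}, index=[0.0, 8.0])

# axis=1 places the frames side by side, matching rows by offset.
combined = pd.concat([notes, durs], axis=1)
print(combined.columns.tolist())   # the voice name appears twice

# Optional: label each half, which produces two-level column names.
labelled = pd.concat([notes, durs], axis=1, keys=["note", "duration"])
print(labelled["duration"])
```

With the labelled version, `labelled["note"]` and `labelled["duration"]` each return one half of the table.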


* We can also reorder the columns to put the information for each voice together.  Here we will use Pandas **iloc**, which is a way to refer to a row or column by its **index** number.  

* In Pandas the first row (or column) is **`0`**.  So to see just the notes_rests and durations for the Superius:
>`combined_notes_durs.iloc[:, [0, 4]]`

* To see all the voices reorganized in this way:  

>`combined_notes_durs.iloc[:, [0, 4, 1, 5, 2, 6, 3, 7]]`

In [8]:
combined_notes_durs.iloc[:, [0, 4, 1, 5, 2, 6, 3, 7]]

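| | Superius | Superius.1 | Altus | Altus.1 | Tenor | Tenor.1 | Bassus | Bassus.1 |
|---|---|---|---|---|---|---|---|---|
| 0.0 | Rest | 48.0 | Rest | 32.0 | A3 | 8.0 | Rest | 16.0 |
| 8.0 | - | - | - | - | D3 | 4.0 | - | - |
| 12.0 | - | - | - | - | A3 | 6.0 | - | - |
| 16.0 | - | - | - | - | - | - | D3 | 8.0 |
| 18.0 | - | - | - | - | A3 | 2.0 | - | - |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1118.0 | G4 | 4.0 | - | - | - | - | - | - |
| 1120.0 | - | - | - | - | A3 | 20.0 | D3 | 20.0 |
| 1122.0 | F#4 | 1.0 | - | - | - | - | - | - |
| 1123.0 | E4 | 1.0 | - | - | - | - | - | - |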
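The list `[0, 4, 1, 5, 2, 6, 3, 7]` is written by hand for a four-voice piece. As a sketch, the same interleaved order can be computed for any number of voices. The frames below are invented, and the approach assumes the notes columns come first, followed by the durations columns in the same voice order:

```python
import pandas as pd

# Invented two-voice stand-ins for piece.notes() and piece.durations().
notes = pd.DataFrame(
    {"Superius": ["Rest", "G4"], "Tenor": ["A3", "D3"]}, index=[0.0, 8.0]
)
durs = pd.DataFrame(
    {"Superius": [8.0, 4.0], "Tenor": [8.0, 4.0]}, index=[0.0, 8.0]
)
combined = pd.concat([notes, durs], axis=1)

# Pair column i (a voice's notes) with column i + n_voices (its durations).
n_voices = len(notes.columns)
order = [
    position
    for pair in zip(range(n_voices), range(n_voices, 2 * n_voices))
    for position in pair
]

by_voice = combined.iloc[:, order]
print(order)
print(by_voice)
```

For four voices the same expression yields `[0, 4, 1, 5, 2, 6, 3, 7]`.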


### C.5 Measures and Beats

* Music21 (and therefore CRIM Intervals) measures time according to **offsets** (one offset = one quarter note). The very first offset in any piece is **0**. 

* Of course human readers will prefer identifying locations by **measure + beat addresses**.
<br>

* To do this we 'pass' a name representing the first set of results **`combined_notes_durs`** to another method, **`detailIndex`**.  Thus:

>`piece.detailIndex(combined_notes_durs)`

* If you would also like to see the offsets, include an additional "argument" in the parentheses: 

>`piece.detailIndex(combined_notes_durs, offset=True)`

In [20]:
meas_beat = piece.detailIndex(combined_notes_durs)
meas_beat


| Measure | Beat | Superius | Altus | Tenor | Bassus | Superius | Altus | Tenor | Bassus |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.0 | Rest | Rest | A3 | Rest | 48.0 | 32.0 | 8.0 | 16.0 |
| 2 | 1.0 | - | - | D3 | - | - | - | 4.0 | - |
| 2 | 3.0 | - | - | A3 | - | - | - | 6.0 | - |
| 3 | 1.0 | - | - | - | D3 | - | - | - | 8.0 |
| 3 | 2.0 | - | - | A3 | - | - | - | 2.0 | - |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 140 | 4.0 | G4 | - | - | - | 4.0 | - | - | - |
| 141 | 1.0 | - | - | A3 | D3 | - | - | 20.0 | 20.0 |
| 141 | 2.0 | F#4 | - | - | - | 1.0 | - | - | - |
| 141 | 2.5 | E4 | - | - | - | 1.0 | - | - | - |
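To see how offsets relate to the measure and beat addresses above, the arithmetic can be sketched by hand for this piece. This is a simplified illustration, not how `detailIndex` is implemented: the function name is hypothetical, and it assumes a constant 4/2 meter from the first note (each bar lasts 8 quarter notes, each beat is a half note) with no pickup bar and no change of time signature.

```python
def offset_to_measure_beat(offset, bar_quarters=8.0, beat_quarters=2.0):
    """Convert a quarter-note offset into a (measure, beat) pair.

    Hypothetical helper for illustration only. The defaults describe 4/2:
    8 quarter notes per bar, 2 quarter notes per beat.
    """
    measure = int(offset // bar_quarters) + 1             # bars are counted from 1
    beat = (offset % bar_quarters) / beat_quarters + 1    # beats are counted from 1
    return measure, beat


# Offsets taken from the tables in this notebook.
for offset in [0.0, 12.0, 18.0, 1118.0, 1123.0]:
    print(offset, offset_to_measure_beat(offset))
```

For a piece with a pickup bar or changing meters this shortcut would give wrong addresses, which is one reason to rely on `piece.detailIndex()` instead.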
