# Coordination Strategies in Networked Jazz Performances

![image](https://user-images.githubusercontent.com/97224401/232460680-1542d4bb-46e7-4a25-bba0-dd206f61db64.png)

*The two musicians (pianist, left; drummer, right) whose performance we will be modelling.*

## Modelling one performance

This notebook walks through the process of loading in the data for a single performance from the corpus, extracting the relevant features, modelling the interaction within the performance using our phase correction model, running a series of experimental simulations using the model, and finally visualising the results using a variety of in-built plots.

For this example notebook, the performance we've selected to analyse is by duo 3 (of five), from the first (of two) experimental sessions, with 90 milliseconds of latency and 0.0x jitter scaling.

## 1. Load dependencies, set constants

**Process:**
- Install dependencies that Google CoLab does not install by default;
- Import dependencies that we need when working with our data;
- Set PATH correctly;
- Set constant variables (number of simulations, example data).

The following lines of code install the packages that Google CoLab does not contain by default:

In [None]:
!git clone https://github.com/HuwCheston/Jazz-Jitter-Analysis
# TODO: we should install all the modules that the default colab notebook needs here

In [None]:
!pip install pretty-midi dill pingouin

Although most of the dependencies are imported within the analysis and visualisation scripts we'll be loading, we still need to import a few additional ones in this notebook to work with the objects we're going to create.

In [None]:
import sys
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

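To import our analysis scripts (located inside `.\src`), we need to temporarily add the module root to Python's module search path, `sys.path`. The cell below uses the location of the cloned repository on Google CoLab; when running this notebook locally (i.e. from inside the `.\notebooks` directory), replace it with the path to your local copy of the repository.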

In [None]:
sys.path.append(os.path.abspath("/content/Jazz-Jitter-Analysis"))
print(sys.path)
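
If you switch between CoLab and a local checkout, the path can be chosen automatically. The following is a minimal sketch rather than part of the repository: `resolve_repo_root` is a hypothetical helper, and the local fallback of `..` assumes the notebook is run from inside the `notebooks` directory.

```python
import os
import sys


def resolve_repo_root(colab_root="/content/Jazz-Jitter-Analysis", local_root=".."):
    """Return the CoLab clone path if it exists, else the local fallback (absolute)."""
    if os.path.isdir(colab_root):
        return os.path.abspath(colab_root)
    # Assumption: when running locally, the module root is the parent directory
    return os.path.abspath(local_root)


repo_root = resolve_repo_root()
# Only append once, so re-running the cell does not duplicate the entry
if repo_root not in sys.path:
    sys.path.append(repo_root)
print(repo_root)
```

Checking for the directory, rather than for a CoLab-specific module, keeps the helper independent of the environment it runs in.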

The following lines of code set constants which can be adjusted by the user.

In [None]:
INPUT_DIR = r"/content/Jazz-Jitter-Analysis/notebooks/example_for_notebook"    # Default ..//references//example_for_notebook
MIDI_MAPPING_FPATH = r"/content/Jazz-Jitter-Analysis/references"    # Default ..//references
NUM_SIMULATIONS = 250    # Default 250: increasing this value will demand greater system resources!

## 2. Load and clean the raw data

In [None]:
from src.clean.gen_pretty_midi import gen_raw_midi_output, gen_pm_output
from src.clean.combine import combine_output

The following few lines of code generate our *raw* MIDI data (equivalent to every note played by a performer) and the *BPM* MIDI data (equivalent just to the crotchet beats they played). Both data streams are stored as .MID files created using our AV-Manip software in Reaper.

In [None]:
output = {}
output['midi_raw'] = gen_raw_midi_output(input_dir=INPUT_DIR, midi_mapping_fpath=MIDI_MAPPING_FPATH)
output['midi_bpm'] = gen_pm_output(input_dir=INPUT_DIR, midi_mapping_fpath=MIDI_MAPPING_FPATH)
output

Now, we combine all of our data streams together.

In [None]:
combined = combine_output(
    input_dir=INPUT_DIR, 
    output_dir=INPUT_DIR, 
    dump_pickle=False,
    zoom_arr=np.genfromtxt(r"/content/Jazz-Jitter-Analysis/references/latency_array.csv"),
    **output
)

Note that, for the sake of simplicity, we don't extract the questionnaire responses or biometric data for each musician; however, there are functions present in `.\src\clean\` that will do this (and are called when the full model set is created). As a result, we need to add some placeholder variables into our cleaned data, so we don't end up with errors when creating our model.

In [None]:
for p in combined:
    for t in p:
        # We set the questionnaire variables to NoneType
        # This will prevent issues if these variables don't exist
        for val in ['interaction', 'coordination', 'success', 'thoughts']:
            t[val] = None

## 3. Create the model
**Process:**
- Import the `PhaseCorrectionModel` class
- Create the model using our raw keys and drums data
- Print the nearest-neighbour matched data for the pianist
- Print the model summary for the drummer
- Create a summary dataframe combining results from both musicians

In [None]:
from src.analyse.phase_correction_models import PhaseCorrectionModel

Now, we split our raw data into two dictionaries, one for the pianist and one for the drummer, and pass these into the `PhaseCorrectionModel` constructor to create the model. The other arguments in our call to `PhaseCorrectionModel` are set to their defaults; they are written out explicitly so you can try altering them and see what happens.

In [None]:
keys_raw, drms_raw = combined[0]
md = PhaseCorrectionModel(
    # Raw data (don't change this!)
    c1_keys=keys_raw,
    c2_drms=drms_raw,
    # Patsy model to use (don't change unless you know what you're doing)
    model='my_next_ioi_diff~my_prev_ioi_diff+asynchrony',
    # Disable any additional cleaning
    contamination=0.0,
    # Upper and lower bound to filter outlying IOIs
    iqr_filter=(0.1, 0.9),   
    # Size of the window in seconds to use when generating rolled values
    rolling_window_size='2s',
    # Minimum number of periods to use when generating rolled data
    rolling_min_periods=2,
    # Maximum number of seconds to lag the latency array by
    maximum_lag=8,
    # Maximum order (M) to create higher-level phase correction models up to
    higher_level_order=4,
)

Once we've created our model, we can access a few attributes which might be helpful. First, we'll print the first 5 matched onsets (from the keyboard player's perspective) as a Pandas `DataFrame`. For clarity, 'my' refers to the keyboard player, and 'their' to the drummer:

In [None]:
md.keys_nn.head(5)
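
To give an intuition for what "matched onsets" means, here is a minimal sketch of nearest-neighbour matching in plain Python. It is an illustration only: `nearest_neighbour_match` is a hypothetical function, the onset times are made up, and the matching inside `PhaseCorrectionModel` may well differ (for example in how it treats duplicate or distant matches).

```python
from bisect import bisect_left


def nearest_neighbour_match(my_onsets, their_onsets):
    """Pair each of my onsets with the closest partner onset (times in seconds).

    Assumes `their_onsets` is non-empty.
    """
    their_sorted = sorted(their_onsets)
    matched = []
    for onset in my_onsets:
        i = bisect_left(their_sorted, onset)
        # Candidates are the partner onsets immediately before and after `onset`
        candidates = their_sorted[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda t: abs(t - onset))
        matched.append((onset, nearest))
    return matched


# Made-up onset times, for illustration only
my_beats = [0.0, 0.5, 1.0]
their_beats = [0.04, 0.47, 1.02, 1.5]
for mine, theirs in nearest_neighbour_match(my_beats, their_beats):
    print(mine, theirs)
```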

Next, we'll print a summary of the model itself (as a StatsModels `OLSResults` instance) for the drummer:

In [None]:
md.drms_md.summary()
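
The summary reports one coefficient for each term in the formula `my_next_ioi_diff~my_prev_ioi_diff+asynchrony`. As a rough guide to what those terms could represent, the sketch below derives them from two lists of matched onsets. The definitions used here are assumptions made for illustration (in particular, asynchrony is taken as the partner's onset minus my onset); `phase_correction_terms` is a hypothetical function and the repository's own feature extraction may differ.

```python
def phase_correction_terms(my_onsets, their_onsets):
    """Build one row of regression terms per beat from matched onsets (seconds)."""
    rows = []
    # Two earlier onsets and one later onset are needed to form both IOI differences
    for k in range(2, len(my_onsets) - 1):
        prev_ioi = my_onsets[k - 1] - my_onsets[k - 2]
        curr_ioi = my_onsets[k] - my_onsets[k - 1]
        next_ioi = my_onsets[k + 1] - my_onsets[k]
        rows.append({
            # Change between my previous two inter-onset intervals (IOIs)
            "my_prev_ioi_diff": curr_ioi - prev_ioi,
            # Change I make to my next IOI: the dependent variable
            "my_next_ioi_diff": next_ioi - curr_ioi,
            # Assumed sign convention: positive when my partner is later than me
            "asynchrony": their_onsets[k] - my_onsets[k],
        })
    return rows


# Made-up matched onsets, for illustration only
mine = [0.0, 0.5, 1.0, 1.6, 2.1]
theirs = [0.02, 0.52, 1.05, 1.6, 2.1]
for row in phase_correction_terms(mine, theirs):
    print(row)
```

Under these assumptions, the coefficient on `asynchrony` describes how strongly a performer adjusts their next interval in response to the timing difference with their partner.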

If you look at the code for `PhaseCorrectionModel`, you'll notice that there are lots of private methods (e.g. `_extract_tempo_slope()`) that carry out particular analysis tasks. Rather than call these methods individually, we can access the pre-computed results using either the `md.keys_dic` or `md.drms_dic` attribute.

In [None]:
md.keys_dic

Finally, we compile a `DataFrame` using both the `md.keys_dic` and `md.drms_dic` attributes together. When working with the full corpus, we can use this to create a nice table, where a single row corresponds to the performance of one musician in one condition. But, for now, we'll just have two rows, one for each performer in our example extract.

In [None]:
df = pd.DataFrame([md.keys_dic, md.drms_dic])
df

## 4. Generate all the simulations
**Process:**
- Create the required simulation paradigms (e.g. `anarchy`, `democracy` etc.)
- Create the simulation objects for each paradigm

First, we need to import our Simulation class:

In [None]:
from src.analyse.simulations import Simulation

We can now proceed to create each of our simulation paradigms. These are:
- `original`: coupling coefficients defined in the model
- `democracy`: both performers coupled to each other at equal rates
- `anarchy`: no adaptation or correction between performers
- `leadership`: pianist coupled to drummer, drummer not coupled to pianist

Our `original` simulation paradigm just uses the coefficients defined in the model, for both performers.

In [None]:
original = df.copy()

Our `democracy` paradigm sets the coupling of both performers precisely equal to each other, to the mean coefficient. We set the intercept to `0` to ensure the stability of the simulation.

In [None]:
democracy = df.copy()
democracy['correction_partner'] = democracy['correction_partner'].mean()
democracy['intercept'] = 0

Our `anarchy` model sets the coupling of both performers to `0`, simulating no adaptation or correction between them. Again, we set the intercept to `0` to ensure the stability of the simulation.

In [None]:
anarchy = df.copy()
anarchy['correction_partner'] = 0
anarchy['intercept'] = 0

Finally, our `leadership` model sets the coupling of the drummer to the pianist to `0` but does not change the coupling of the pianist to the drummer, simulating a leader-follower relationship (with the drummer as the leader). We again set the intercept to `0` to ensure the stability of the simulation.

In [None]:
leadership = df.copy()
# Zero the drummer's coupling; every other row keeps its own coefficient
leadership['correction_partner'] = np.where(
    leadership['instrument'] == 'Drums', 0,
    leadership['correction_partner']
)
leadership['intercept'] = 0
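
The four paradigms can also be summarised in a single function. This is a minimal sketch that uses a plain dictionary of coupling coefficients instead of the `DataFrame` above; `apply_paradigm` is a hypothetical helper and the coefficient values are made up for illustration.

```python
def apply_paradigm(couplings, paradigm):
    """Return a new {instrument: coupling} dict for the named paradigm."""
    if paradigm == "original":
        # Keep the coefficients estimated by the model
        return dict(couplings)
    if paradigm == "democracy":
        # Both performers share the mean coefficient
        mean = sum(couplings.values()) / len(couplings)
        return {inst: mean for inst in couplings}
    if paradigm == "anarchy":
        # Nobody adapts to anybody
        return {inst: 0.0 for inst in couplings}
    if paradigm == "leadership":
        # The drummer leads (no coupling); the pianist keeps their coefficient
        return {inst: (0.0 if inst == "Drums" else c) for inst, c in couplings.items()}
    raise ValueError(f"Unknown paradigm: {paradigm!r}")


# Hypothetical coupling coefficients, for illustration only
example = {"Keys": 0.5, "Drums": 0.25}
for name in ("original", "democracy", "anarchy", "leadership"):
    print(name, apply_paradigm(example, name))
```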

OK, now we can create all of our simulations. The following code iterates over all of the paradigms we defined above, creates the `Simulation` object for each paradigm, creates the desired number of individual simulated performances (defined in the `NUM_SIMULATIONS` constant), then stores the results in our `sims_list` iterable that we'll access when creating our graphs later.

In [None]:
sims_list = []
for md_, param in zip(
    [original, anarchy, democracy, leadership], 
    ['original', 'anarchy', 'democracy', 'leadership'],
):
    sim_ = Simulation(
        # The phase correction model results
        pcm=md_,
        # The number of simulations we'll run, defaults to 250
        num_simulations=NUM_SIMULATIONS,
        # This argument is just used to store the parameter used as a string inside the Simulation instance
        parameter=param, 
        # Whether to reuse the model's original noise term; disabled here in favour of the fixed `noise` value below
        use_original_noise=False,
        noise=0.005
    )
    sim_.create_all_simulations()
    sims_list.append(sim_)
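
To see why the coupling coefficients matter, it can help to simulate a much simpler system by hand. The sketch below is a basic linear phase correction loop for two performers, with no latency, jitter, or intercept; it is not the algorithm used by the `Simulation` class, and `simulate_duo` and all of its parameters are hypothetical names chosen for illustration.

```python
import random


def simulate_duo(coupling_keys, coupling_drms, n_beats=16, period=0.5,
                 noise=0.005, initial_asynchrony=0.0, seed=0):
    """Return the asynchrony (drums minus keys, in seconds) at each beat."""
    rng = random.Random(seed)
    keys, drms = 0.0, initial_asynchrony
    asynchronies = [drms - keys]
    for _ in range(n_beats):
        # Positive when the drummer is later than the pianist
        async_ = drms - keys
        # Each performer shifts their next onset towards their partner,
        # in proportion to their own coupling coefficient
        keys_next = keys + period + coupling_keys * async_ + rng.gauss(0.0, noise)
        drms_next = drms + period - coupling_drms * async_ + rng.gauss(0.0, noise)
        keys, drms = keys_next, drms_next
        asynchronies.append(drms - keys)
    return asynchronies


# Noise-free runs starting 100 ms apart: anarchy never recovers,
# whereas equal coupling of 0.25 halves the asynchrony on every beat
print(simulate_duo(0.0, 0.0, n_beats=4, noise=0.0, initial_asynchrony=0.1))
print(simulate_duo(0.25, 0.25, n_beats=4, noise=0.0, initial_asynchrony=0.1))
```

In this simplified system the asynchrony is multiplied by one minus the sum of the two coupling coefficients on each beat, which is why setting both to `0` leaves any timing difference uncorrected.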

## 5. Create some graphs

**Process:**
- Create a plot showing modelled coupling coefficients, relative phase, and tempo
- Create a plot comparing simulated tempo and asynchrony across paradigms

First, we'll create a nice plot of the individual coupling coefficients obtained for each performer, the relative phase of each performer compared to their partner (whether they play 'in-front' or 'behind'), and the tempo trajectory of their performance.

In [None]:
from src.visualise.phase_correction_graphs import SingleConditionPlot
from src.analyse.analysis_utils import generate_df

In [None]:
g = SingleConditionPlot(
    # Keys data
    keys_df=md.keys_nn,
    keys_md=md.keys_md,
    keys_o=generate_df(md.keys_dic['raw_beats'][0]),
    # Drums data
    drms_df=md.drms_nn,
    drms_md=md.drms_md,
    drms_o=generate_df(md.drms_dic['raw_beats'][0]),
    # Metadata used to create the plot title, etc.
    metadata=(
        3, 1, 90, 0
    )
)
g.create_plot()
plt.show()

Finally, we'll generate a line plot that compares the simulated tempo and asynchrony values obtained across all of our paradigms.

In [None]:
from src.visualise.simulations_graphs import LinePlotAllParameters

In [None]:
g = LinePlotAllParameters(
    simulations=sims_list,
    keys_orig=md.keys_nn,
    drms_orig=md.drms_nn,
    # Metadata used to create the plot title, etc.
    params={
        'trial': 3,
        'block': 1,
        'latency': 90,
        'jitter': 0
    }
)
g.create_plot()
plt.show()