# MATCH

This notebook provides wrapper functions for calling the [MATCH algorithm](https://www.eecs.qmul.ac.uk/~simond/match/).  Running this algorithm requires installing some other software, which is described below.  This notebook implements the `offline_processing()` and `online_processing()` functions, which will be imported and run in `02_RunExperiment.ipynb`.

Here is a summary of the MATCH approach:
- Offline processing: The orchestra and full mix recordings are aligned with standard DTW using chroma features.
- Online processing: The solo piano and full mix recordings are aligned with the MATCH algorithm, and the predicted alignment is then used to infer the corresponding alignment between the piano and orchestra recordings.

## Offline Processing

The offline processing is identical to that of the simple offline DTW system.  Three things are computed and stored in the `cache/` folder:
- chroma features for the orchestra recording
- chroma features for the full mix recording
- predicted DTW alignment between the orchestra and full mix recordings
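
To make the role of the `steps` and `weights` arguments concrete, here is a minimal NumPy sketch of DTW with configurable transitions.  It is an illustration only and is not the code in `System_SimpleOfflineDTW`; the function name `dtw_with_steps`, the multiplicative use of the weights, and the toy 1-D sequences are all assumptions made for this example.

```python
import numpy as np

def dtw_with_steps(C, steps, weights):
    """Toy DTW over a pairwise cost matrix (hypothetical helper, not from the notebook).

    C: N x M array of pairwise frame costs
    steps: L x 2 integer array of allowable (row, col) transitions
    weights: length-L array, one multiplicative weight per transition (assumed convention)

    Returns a 2 x K array of (row, col) indices along the lowest-cost path.
    """
    N, M = C.shape
    D = np.full((N, M), np.inf)         # accumulated cost
    B = np.full((N, M), -1, dtype=int)  # index of the step used to reach each cell
    D[0, 0] = C[0, 0]
    for i in range(N):
        for j in range(M):
            for k, (di, dj) in enumerate(steps):
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                cand = D[pi, pj] + weights[k] * C[i, j]
                if cand < D[i, j]:
                    D[i, j] = cand
                    B[i, j] = k
    if not np.isfinite(D[-1, -1]):
        raise ValueError("end cell is unreachable with the given steps")
    # backtrace from the last cell to the first
    i, j = N - 1, M - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        di, dj = steps[B[i, j]]
        i, j = i - di, j - dj
        path.append((i, j))
    return np.array(path[::-1]).T

# toy data: two short 1-D "feature" sequences (made up for this example)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 1.5, 2.0, 3.0])
C = np.abs(x[:, None] - y[None, :])
steps = np.array([1, 1, 1, 2, 2, 1]).reshape((-1, 2))
weights = np.array([2, 3, 3], dtype=np.float64)
wp = dtw_with_steps(C, steps, weights)
```

Each row of `steps` is paired with the weight at the same index, so the `(1,1)` transition here costs twice the local cost while the `(1,2)` and `(2,1)` transitions cost three times the local cost.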



In [None]:
import numpy as np
import pandas as pd
import import_ipynb
import System_SimpleOfflineDTW
import system_utils
import os
import os.path

In [None]:
def offline_processing(scenario_dir, cache_dir, hop_length, steps, weights):
    '''
    Carries out the same offline processing steps as the simple offline DTW system.
    
    Inputs
    scenario_dir: The scenario directory to process
    cache_dir: The location of the cache directory
    hop_length: The hop length in samples used when computing chroma features
    steps: an L x 2 array specifying the allowable DTW transitions
    weights: a length L array specifying the DTW transition weights
    
    This function will store the computed chroma features and estimated alignment in the cache folder.
    '''
    System_SimpleOfflineDTW.offline_processing(scenario_dir, cache_dir, hop_length, steps, weights)
    
    return

## Online Processing

In the online processing stage, we do two things:
1. compute an online alignment between the piano and full mix recordings using MATCH, and
2. use the predicted alignment to infer the alignment between the piano and orchestra recordings.

Note that step 1 runs to completion before step 2 begins.  This implementation is therefore not a true online system, but its accuracy still indicates how well an online system would perform.
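
Step 2 amounts to composing two alignments.  Below is a minimal sketch of one way to do this with linear interpolation.  `compose_alignments` is a hypothetical stand-in, and the actual `system_utils.infer_alignment` may differ.  The row conventions follow the variable names used in this notebook: the piano-full mix path holds piano times in row 0 and full mix times in row 1, while the full mix-orchestra path holds full mix times in row 0 and orchestra times in row 1.  The numbers are made up.

```python
import numpy as np

def compose_alignments(wp_AB, wp_BC):
    """Map A-times to C-times through B (hypothetical helper, for illustration).

    wp_AB: 2 x N array, row 0 = times in A, row 1 = matching times in B
    wp_BC: 2 x M array, row 0 = times in B, row 1 = matching times in C
    Both paths are assumed to be non-decreasing in time.
    """
    # look up each B-time of the first path on the second path
    times_C = np.interp(wp_AB[1], wp_BC[0], wp_BC[1])
    return np.vstack((wp_AB[0], times_C))

# made-up alignments in seconds
wp_AB = np.array([[0.0, 1.0, 2.0], [10.0, 11.5, 13.0]])
wp_BC = np.array([[0.0, 10.0, 20.0], [0.0, 20.0, 40.0]])
wp_AC = compose_alignments(wp_AB, wp_BC)
```

The result keeps the piano time stamps unchanged and replaces each full mix time with the orchestra time it maps to.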

### Software Installation

Using the MATCH algorithm requires a few pieces of software to be installed:
- [Sonic Annotator](https://vamp-plugins.org/sonic-annotator/), a program for command-line processing of audio files
- [the MATCH Vamp plugin](https://code.soundsoftware.ac.uk/projects/match-vamp/), an implementation of the MATCH algorithm which can be used in tandem with Sonic Annotator
- SoX, a command-line audio utility

Below, we assume that the `sonic-annotator` and `sox` binaries can be called from the command line, and that the MATCH Vamp plugin has been installed.  See [here](https://vamp-plugins.org/download.html#install) for instructions on how to install Vamp plugins.
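
A quick way to check the first assumption is to look the binaries up on the `PATH` before running anything.  The helper below is a small sketch; `find_missing_tools` is a name invented for this example and is not part of the notebook.

```python
import shutil

def find_missing_tools(tools=("sonic-annotator", "sox")):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [t for t in tools if shutil.which(t) is None]

missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
```

This check does not cover the Vamp plugin itself.  Running `sonic-annotator -l` lists the available transforms, and the plugin should be usable if `vamp:match-vamp-plugin:match:b_a` appears in that list.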

### Wrapper Implementation

In [None]:
def parse_match_outfile(infile):
    '''
    Parses the MATCH csv output file specifying the estimated alignment.
    
    Inputs
    infile: filepath to the MATCH csv output file
    
    Returns a 2xN array indicating the estimated alignment in seconds.
    '''
    d = pd.read_csv(infile, header=None)
    return np.vstack((d.loc[:,1], d.loc[:,2]))

In [None]:
def online_processing(scenario_dir, out_dir, cache_dir, hop_sec):
    '''
    Carries out 'online' processing using the MATCH algorithm.
    
    Inputs
    scenario_dir: The scenario directory to process
    out_dir: The directory to put results, intermediate files, and logging info
    cache_dir: The cache directory
    hop_sec: The hop size in sec used in the offline DTW stage

    This function will compute the predicted alignment and save it in the output directory in the file hyp.npy.
    '''
    
    # verify & setup
    system_utils.verify_scenario_dir(scenario_dir)
    system_utils.verify_cache_dir(cache_dir)
    assert not os.path.exists(out_dir)
    os.makedirs(out_dir)
           
    # determine the start time of the query in the orchestra recording (ground truth)
    orch_start_sec = system_utils.get_orchestra_start(scenario_dir)
    
    # infer the start time of the query in the full mix recording (estimated)
    wp_BC_frm = np.load(f'{cache_dir}/po_o_align.npy')
    wp_BC_sec = wp_BC_frm * hop_sec
    fullmix_start_sec = np.interp(orch_start_sec, wp_BC_sec[1,:], wp_BC_sec[0,:])
    
    # create audio recording of full mix starting at the start location
    fullmix_orig_filepath = f'{scenario_dir}/po.wav'
    fullmix_mod_filepath = f'{out_dir}/po_mod.wav'
    # quote the paths so that directories containing spaces do not break the command
    os.system(f'sox "{fullmix_orig_filepath}" "{fullmix_mod_filepath}" rate 44100 trim {fullmix_start_sec}')
    
    # run MATCH plugin
    piano_filepath = f'{scenario_dir}/p.wav'
    match_align_filepath = f'{out_dir}/match_p_po.out'
    # quote the paths so that directories containing spaces do not break the command
    os.system(f'sonic-annotator -d vamp:match-vamp-plugin:match:b_a -m "{piano_filepath}" "{fullmix_mod_filepath}" -w csv --csv-stdout > "{match_align_filepath}"')
    
    # infer piano-orchestra alignment
    wp_AB_sec = parse_match_outfile(match_align_filepath) # piano-fullmix alignment
    wp_AB_sec[1,:] = wp_AB_sec[1,:] + fullmix_start_sec # account for offset
    wp_AC_sec = system_utils.infer_alignment(wp_AB_sec, wp_BC_sec) 
    np.save(f'{out_dir}/hyp.npy', wp_AC_sec)

    return

## Example

Here is an example of how to call the offline and online processing functions on a scenario directory.

In [None]:
# scenario_dir = 'scenarios/s2'
# out_dir = 'experiments/test/s2'
# cache_dir = 'experiments/test/cache'
# hop_size = 512
# steps = np.array([1,1,1,2,2,1]).reshape((-1,2))
# weights = np.array([2,3,3], dtype=np.float64)
# #offline_processing(scenario_dir, cache_dir, hop_size, steps, weights)
# online_processing(scenario_dir, out_dir, cache_dir, hop_size / 22050)