# Overview of Systems

The unused/ folder contains systems that we experimented with but that did not lead to promising results.  This notebook gives an overview of these systems.

## System_SeparatedMATCH

Summary:
- Assumes source separation has been performed on the PO recordings as a preprocessing step
- Aligns O against the estimated O component of the PO recording using subsequence DTW with chroma features
- Aligns P against the estimated P component of the PO recording using MATCH.  Assumes that we know the starting location in the O recording (and can estimate the corresponding location in the PO recording).
- Infers P-O alignment
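
The subsequence DTW step in the summary above can be sketched as follows. This is a minimal NumPy illustration rather than the notebook's implementation: it assumes chroma features stored as (12, n_frames) arrays, cosine distance, and unweighted steps of (1,1), (1,0), and (0,1). The names `cosine_cost` and `subsequence_dtw` are hypothetical.

```python
import numpy as np

def cosine_cost(query, ref):
    """Pairwise cosine distance between the columns of two (12, n_frames) matrices."""
    q = query / (np.linalg.norm(query, axis=0, keepdims=True) + 1e-9)
    r = ref / (np.linalg.norm(ref, axis=0, keepdims=True) + 1e-9)
    return 1.0 - q.T @ r  # shape (n_query, n_ref)

def subsequence_dtw(C):
    """Align every row of C (query) to a contiguous span of columns (reference)."""
    n, m = C.shape
    D = np.full((n, m), np.inf)
    D[0, :] = C[0, :]  # free start: the path may begin at any reference frame
    for i in range(1, n):
        D[i, 0] = D[i - 1, 0] + C[i, 0]
        for j in range(1, m):
            D[i, j] = C[i, j] + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    # Free end: start backtracking from the cheapest cell in the last row
    i, j = n - 1, int(np.argmin(D[-1]))
    path = [(i, j)]
    while i > 0:
        candidates = [(D[i - 1, j], (i - 1, j))]
        if j > 0:
            candidates += [(D[i - 1, j - 1], (i - 1, j - 1)), (D[i, j - 1], (i, j - 1))]
        _, (i, j) = min(candidates, key=lambda c: c[0])
        path.append((i, j))
    return path[::-1]
```

The returned path lists (query_frame, reference_frame) pairs, so its first and last entries give the matched region of the reference.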

Main findings:
- This system is a reasonable baseline, but was abandoned because we switched our focus to a purely offline formulation of the problem.  It has been replaced by System_SeparatedDTW.ipynb

## System_MATCH

Summary:
- Aligns O against the PO recording using subsequence DTW with chroma features
- Aligns P against the PO recording using MATCH.  Assumes that we know the starting location in the O recording (and can estimate the corresponding location in the PO recording).
- Infers P-O alignment
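
Inferring the P-O alignment from the two pairwise alignments amounts to composing them through the shared PO time axis. The sketch below assumes each alignment is available as a list of (frame, po_frame) pairs and uses linear interpolation; `infer_p_o_alignment` is a hypothetical name, and the notebook may do this step differently.

```python
import numpy as np

def infer_p_o_alignment(p_po_path, o_po_path):
    """Map each aligned P frame to an O frame via the shared PO time axis.

    Both inputs are sequences of (frame, po_frame) pairs with non-decreasing
    columns, as a DTW backtrace would produce.
    """
    p_po = np.asarray(p_po_path, dtype=float)
    o_po = np.asarray(o_po_path, dtype=float)
    # np.interp needs increasing x values, so keep one O frame per PO frame
    po_frames, first = np.unique(o_po[:, 1], return_index=True)
    o_frames = o_po[first, 0]
    # Look up the PO frame of each P frame, then interpolate the matching O frame.
    # PO frames outside the O-PO path are clamped to its first or last O frame.
    o_at_p = np.interp(p_po[:, 1], po_frames, o_frames)
    return np.column_stack([p_po[:, 0], o_at_p])
```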

Main findings:
- This system is a reasonable baseline, but was abandoned because we switched our focus to a purely offline formulation of the problem.  It has been replaced by System_NaivePairwiseDTW.ipynb

## System_PairwiseFSVEDTW

Summary:
- Aligns O against the PO recording using subsequence DTW with chroma features
- Aligns P against the PO recording using fixed-start variable-end DTW.  Assumes that we have oracle knowledge about the starting location in the O recording (and can estimate the corresponding location in the PO recording).
- Infers P-O alignment
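
Fixed-start variable-end DTW differs from subsequence DTW only in its boundary conditions: the path is pinned to a known starting cell, while the end column is left free. A minimal sketch, assuming a precomputed cost matrix with P frames as rows and PO frames as columns (`fsve_dtw_end` and `start_col` are hypothetical names):

```python
import numpy as np

def fsve_dtw_end(C, start_col):
    """Return (end_col, total_cost) for a path pinned at (0, start_col)."""
    n, m = C.shape
    D = np.full((n, m), np.inf)
    D[0, start_col] = C[0, start_col]  # fixed start: all other entry cells stay at infinity
    for i in range(n):
        for j in range(start_col, m):
            if i == 0 and j == start_col:
                continue
            best = np.inf
            if i > 0:
                best = min(best, D[i - 1, j])
            if j > start_col:
                best = min(best, D[i, j - 1])
            if i > 0 and j > start_col:
                best = min(best, D[i - 1, j - 1])
            D[i, j] = C[i, j] + best
    end_col = int(np.argmin(D[-1]))  # variable end: cheapest cell in the last row
    return end_col, float(D[-1, end_col])
```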

Main findings:
- This system is a reasonable baseline, but was abandoned because we switched our focus to a purely offline formulation of the problem.  It has been replaced by System_NaivePairwiseDTW.ipynb

## System_OnlineGreedyMixtureDTW

Summary:
- Searches for a good path through a 3D cost tensor based on a mixture cost metric, where elements indicate the dissimilarity between P+O and PO frames
- Does not materialize the 3D cost tensor; instead, it computes elements of the tensor on the fly and greedily selects the next best step
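
The greedy search described above can be sketched as follows. The step set (advance any non-empty subset of the three indices by one frame) and the cosine-based mixture cost are assumptions made for illustration; `mixture_cost` and `greedy_path` are hypothetical names.

```python
import itertools
import numpy as np

# Assumed step set: advance any non-empty subset of the (P, O, PO) indices by one frame
STEPS = [s for s in itertools.product((0, 1), repeat=3) if any(s)]

def mixture_cost(p_frame, o_frame, po_frame):
    """Cosine distance between the summed P+O features and the PO features."""
    mix = p_frame + o_frame
    denom = np.linalg.norm(mix) * np.linalg.norm(po_frame) + 1e-9
    return 1.0 - float(mix @ po_frame) / denom

def greedy_path(P, O, PO):
    """Walk from (0, 0, 0) to the far corner, always taking the locally cheapest step."""
    i = j = k = 0
    path = [(i, j, k)]
    limits = (P.shape[1] - 1, O.shape[1] - 1, PO.shape[1] - 1)
    while (i, j, k) != limits:
        best = None
        for di, dj, dk in STEPS:
            ni, nj, nk = i + di, j + dj, k + dk
            if ni > limits[0] or nj > limits[1] or nk > limits[2]:
                continue
            # Cost is computed on demand; no 3D tensor is ever stored
            c = mixture_cost(P[:, ni], O[:, nj], PO[:, nk])
            if best is None or c < best[0]:
                best = (c, (ni, nj, nk))
        i, j, k = best[1]
        path.append((i, j, k))
    return path
```

Because each decision is local, an early wrong step cannot be undone later, which is the main weakness of a greedy search compared with full dynamic programming.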

Main findings:
- We did not find any promising results with this idea.

## System_MixtureDTW3D

Summary:
- Computes a 3D cost tensor based on a mixture cost metric, where elements indicate the dissimilarity between P+O and PO features
- Finds the optimal path through the 3D cost tensor using dynamic programming, where we assume that the path starts at the origin and ends at the opposite corner of the tensor
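
A minimal sketch of the two steps above, assuming (d, n_frames) feature arrays, a cosine-based mixture cost, and the seven unit steps that advance any non-empty subset of the three indices. These choices and the function names are illustrative assumptions, not the notebook's exact settings.

```python
import itertools
import numpy as np

def mixture_cost_tensor(P, O, PO):
    """C[i, j, k] = cosine distance between P[:, i] + O[:, j] and PO[:, k]."""
    mix = P[:, :, None] + O[:, None, :]  # shape (d, nP, nO)
    mix = mix / (np.linalg.norm(mix, axis=0, keepdims=True) + 1e-9)
    po = PO / (np.linalg.norm(PO, axis=0, keepdims=True) + 1e-9)
    return 1.0 - np.einsum("dij,dk->ijk", mix, po)

def dtw3d(C):
    """Accumulated cost from the origin to every cell of a 3D cost tensor."""
    steps = [s for s in itertools.product((0, 1), repeat=3) if any(s)]
    D = np.full(C.shape, np.inf)
    D[0, 0, 0] = C[0, 0, 0]
    # np.ndindex visits cells in C order, so all predecessors are already filled in
    for idx in np.ndindex(*C.shape):
        if idx == (0, 0, 0):
            continue
        prev = [tuple(a - b for a, b in zip(idx, s)) for s in steps]
        D[idx] = C[idx] + min(D[p] for p in prev if min(p) >= 0)
    return D
```

`D[-1, -1, -1]` holds the total cost of the optimal path ending at the opposite corner; recovering the path itself would require an additional backtrace.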

Main findings:
- Doing dynamic programming through a 3D cost tensor is very computationally expensive, so we had to downsample the features in order to achieve reasonable runtimes.
- With the downsampled features, we did not get any good results.
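
To see why downsampling was necessary, note that the number of tensor cells is the product of the three sequence lengths, so reducing the feature rate by a factor of r shrinks the tensor by r cubed. The durations and feature rates below are hypothetical values chosen for illustration, not measurements from our data.

```python
def tensor_cells(durations_sec, frames_per_sec):
    """Number of cells in the 3D cost tensor for three recordings at one feature rate."""
    cells = 1
    for d in durations_sec:
        cells *= int(d * frames_per_sec)
    return cells

# Hypothetical durations (seconds) for P, O, and PO
durations = (60, 60, 60)
for rate in (50, 10, 2):  # assumed feature rates in frames per second
    n = tensor_cells(durations, rate)
    print(f"{rate:>3} fps: {n:.2e} cells, {n * 8 / 1e9:.2f} GB as float64")
```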

## System_MixtureSubseqDTW3D

Summary:
- First computes P-PO alignment using subsequence DTW with chroma features.  The purpose of this is to select the matching region of PO, in order to reduce the size of the 3D cost tensor in the next step.
- Computes a 3D cost tensor based on a mixture cost metric between P, O, and PO-match.  Elements of the cost tensor indicate the dissimilarity between P+O and PO features
- Finds the optimal path through the 3D cost tensor using a subsequence 3D alignment algorithm, where we allow the path to start and end anywhere in the O recording.
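
The cropping in the first step can be sketched as follows, assuming the P-PO alignment is available as a list of (p_frame, po_frame) pairs. The function name and the `margin` parameter are hypothetical.

```python
import numpy as np

def crop_to_match(PO, p_po_path, margin=0):
    """Keep only the PO frames covered by the P-PO alignment path.

    Returns the cropped features and the PO frame index where the crop starts.
    """
    po_frames = [po for _, po in p_po_path]
    start = max(min(po_frames) - margin, 0)
    end = min(max(po_frames) + margin, PO.shape[1] - 1)
    return PO[:, start:end + 1], start
```

Only the cropped PO-match frames enter the 3D cost tensor, which shrinks one of its three dimensions; the returned start index lets us map results back to the original PO time axis.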

Main findings:
- Doing dynamic programming through a 3D cost tensor is very computationally expensive, so we had to downsample the features in order to achieve reasonable runtimes.
- With the downsampled features, we did not get any good results.

## System_MixtureFlexDTW3D

Summary:
- Computes a 3D cost tensor based on a mixture cost metric between P, O, and PO.  Elements of the cost tensor indicate the dissimilarity between P+O and PO features
- Finds the optimal path through the 3D cost tensor using a 3D FlexDTW alignment approach, where the path may start anywhere on the three beginning faces and end anywhere on the three opposite faces.

Main findings:
- Doing dynamic programming through a 3D cost tensor is very computationally expensive, so we had to downsample the features in order to achieve reasonable runtimes.
- With the downsampled features, we did not get any good results.

## System_AltPairwiseMixDTW

Summary:
1. Estimate initial P-PO and O-PO alignments using subsequence DTW with chroma features
2. Assuming the P-PO alignment is fixed, re-estimate the O-PO alignment.  This is done by adding the aligned P features to the O features, comparing to the PO features, and performing 2D subsequence DTW.
3. Assuming the O-PO alignment is fixed, re-estimate the P-PO alignment.  This is done by adding the aligned O features to the P features, comparing to the PO features, and performing 2D subsequence DTW.
4. Repeat steps 2 and 3 until convergence

Main findings:
- This approach is computationally very expensive since it requires running multiple iterations
- The results were essentially no better than a naive pairwise approach: they improved by only a minuscule amount.

## System_ExplBasedMixDTW

Summary:
1. Estimate the P-PO alignment using subsequence DTW with chroma features
2. Using the estimated alignment, calculate a baseline similarity between PO-match and the corresponding P features
3. Assuming a fixed P-PO alignment, calculate a 2D cost matrix using a mixture-based loss, which indicates the dissimilarity between O+P and PO
4. Calculate the difference between the mixture similarity (in step 3) and the baseline similarity (in step 2).  This describes how much adding O features improves our explanation of the PO features, compared to only using the P features.
5. Perform subsequence DTW through this 2D cost matrix.
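
Steps 2 through 4 can be sketched as follows. The sketch assumes the P features have already been warped onto the PO-match frames, uses cosine similarity for both the baseline and the mixture terms, and negates the gain so that DTW can minimize it. These choices and the function names are illustrative assumptions.

```python
import numpy as np

def cosine_sim(a, b):
    """Column-wise cosine similarity of two equally shaped (d, n) matrices."""
    num = np.sum(a * b, axis=0)
    return num / (np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0) + 1e-9)

def explanation_cost(P_aligned, O, PO_match):
    """Cost that is low where adding an O frame explains PO better than P alone.

    P_aligned and PO_match are (d, n) and frame-aligned; O is (d, m).
    Returns an (m, n) matrix.
    """
    baseline = cosine_sim(P_aligned, PO_match)  # step 2, shape (n,)
    mix = O[:, :, None] + P_aligned[:, None, :]  # shape (d, m, n)
    mix = mix / (np.linalg.norm(mix, axis=0, keepdims=True) + 1e-9)
    po = PO_match / (np.linalg.norm(PO_match, axis=0, keepdims=True) + 1e-9)
    mixture = np.einsum("dmn,dn->mn", mix, po)  # step 3, expressed as a similarity
    gain = mixture - baseline[None, :]  # step 4
    return -gain  # negated so that a larger gain means a lower cost
```

The resulting matrix can then be passed to any 2D subsequence DTW routine for step 5.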

Main findings:
- This approach is computationally very expensive and produced results worse than naive pairwise DTW.

## System_SeparatedSparse

Summary:
- Perform source separation on the PO recording
- Estimate the O-PO alignment by aligning O against the estimated O component in PO.  This alignment is done using dense-sparse DTW with selected features from the O recording (sparse) and all frames in the O_est sequence.
- Estimate the P-PO alignment by aligning P against the estimated P component in PO.  Uses standard DTW with chroma features.

Main findings:
- This system performed worse than using dense-sparse DTW to directly estimate the O-PO alignment (without source separation).

## System_TimeSparse

Summary:
- Train a GMM to model MFCC features in the P recording
- Train a GMM to model MFCC features in the O recording
- Classify frames in PO as either P or O using the two GMMs
- Align classified P frames (from PO) against P using dense-sparse DTW with chroma features
- Align classified O frames (from PO) against O using dense-sparse DTW with chroma features
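
The frame classification in the first three steps can be sketched with scikit-learn's `GaussianMixture`. The number of components, the diagonal covariance, and the function name are assumptions for illustration; MFCC extraction and the dense-sparse DTW steps are not shown.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def classify_po_frames(mfcc_p, mfcc_o, mfcc_po, n_components=4):
    """Label each PO frame as P-like (True) or O-like (False).

    All inputs are (n_frames, n_mfcc) arrays; n_components is an assumed setting.
    """
    gmm_p = GaussianMixture(n_components=n_components, covariance_type="diag",
                            random_state=0).fit(mfcc_p)
    gmm_o = GaussianMixture(n_components=n_components, covariance_type="diag",
                            random_state=0).fit(mfcc_o)
    # score_samples returns the per-frame log-likelihood under each model
    return gmm_p.score_samples(mfcc_po) > gmm_o.score_samples(mfcc_po)
```

The boolean mask selects which PO frames are passed to the P alignment and which to the O alignment.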

Main findings:
- This system yielded worse results than the naive pairwise DTW approach.