# AMT Error Matching

We provide the `measure_errors` script to measure the proportion of each type of
error present in a given set of transcriptions.

The script outputs a `config.json` file which can either be passed to the
`make_dataset` script to create a static ACME dataset of the given proportions,
or it can be used to instantiate a `Degrader`, which can create degraded data
on-the-fly.
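
To make the role of the two config values concrete, the sketch below shows one way they could drive on-the-fly degradation: with probability `clean_prop` an excerpt is left clean, and otherwise one degradation is drawn according to `degradation_dist`. This is a standalone illustration and not mdtk's `Degrader` implementation; the function `sample_degradation` and the example names and values are assumptions made for this example.

```python
import random


def sample_degradation(config, names, rng):
    """Return None (leave the excerpt clean) or the name of a degradation."""
    # With probability clean_prop, leave the excerpt untouched.
    if rng.random() < config["clean_prop"]:
        return None
    # Otherwise draw one degradation, weighted by the measured distribution.
    return rng.choices(names, weights=config["degradation_dist"], k=1)[0]


# Hypothetical values, for illustration only.
example_config = {"degradation_dist": [0.5, 0.3, 0.2], "clean_prop": 0.25}
example_names = ["pitch_shift", "time_shift", "add_note"]

rng = random.Random(42)
samples = [sample_degradation(example_config, example_names, rng) for _ in range(1000)]
clean_fraction = samples.count(None) / len(samples)
print(f"Fraction left clean: {clean_fraction:.2f}")
```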

# Create ACME

We'll first create a small ACME dataset using only the PianoMidi data. For this example, we'll use the degraded data as our "transcribed" data.

In [None]:
! python ../make_dataset.py --datasets PianoMidi --no-prompt --seed 42

# Load the Metadata

We'll need to load the dataset's metadata in order to use it as our pseudo-transcriptions.

In [None]:
import os
import pandas as pd

degradation_ids = pd.read_csv(os.path.join('acme', 'degradation_ids.csv'))

metadata = pd.read_csv(os.path.join('acme', 'metadata.csv'))
metadata.head(10)

# Organize the "AMT" Output

In [None]:
import shutil
import os
from tqdm import tqdm

trans_dir = os.path.join("error_matching", "trans")
gt_dir = os.path.join("error_matching", "gt")

os.makedirs(trans_dir, exist_ok=True)
os.makedirs(gt_dir, exist_ok=True)

for idx, row in tqdm(metadata.iterrows(), total=len(metadata)):
    shutil.copy(os.path.join('acme', row.clean_csv_path), gt_dir)
    shutil.copy(os.path.join('acme', row.altered_csv_path), trans_dir)

Now, `error_matching` contains two directories:
* `gt`: Contains the ground truth clean excerpts.
* `trans`: Contains the "transcribed" excerpts.

Matching ground truth and transcribed files _must_ have the same basename
in their respective directories. The files can be CSV, MIDI, or pickle (see below).
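
As a rough illustration of this pairing rule, the sketch below matches files from two directories whose names agree once the extension is removed. It is not the code used by `measure_errors`; the helper name `match_by_basename` is hypothetical, and it does not search recursively.

```python
import os


def match_by_basename(gt_dir, trans_dir):
    """Pair files whose names match once the extension is removed."""

    def stems(directory):
        # Map "file1" -> "file1.mid". If several files share a stem,
        # the alphabetically last one wins.
        return {os.path.splitext(name)[0]: name for name in sorted(os.listdir(directory))}

    gt_files = stems(gt_dir)
    trans_files = stems(trans_dir)

    # Keep only the stems present in both directories.
    return [
        (os.path.join(gt_dir, gt_files[stem]), os.path.join(trans_dir, trans_files[stem]))
        for stem in sorted(gt_files.keys() & trans_files.keys())
    ]
```

Calling `match_by_basename(gt_dir, trans_dir)` on the directories created above would return one `(ground truth, transcription)` path pair per excerpt.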

# Run the Measure Errors script

In [None]:
! python ../measure_errors.py --trans error_matching/trans --gt error_matching/gt --excerpt-length 15000 --min-notes 5

The `measure_errors` script can be run on full piece transcriptions, so you can set `--excerpt-length` and `--min-notes` via command line arguments. In our case, since we are running it only on excerpts, we set `--excerpt-length` to be longer than all of the excerpts, and `--min-notes` to be smaller than the number of notes in any excerpt.
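
The effect of these two arguments can be sketched as follows. This is a simplified illustration under stated assumptions, not the script's actual implementation: notes are plain `(onset_ms, pitch, duration_ms)` tuples, each note is assigned to the window containing its onset, and the function name `split_into_excerpts` is hypothetical.

```python
def split_into_excerpts(notes, excerpt_length, min_notes):
    """Split (onset_ms, pitch, duration_ms) notes into fixed-length excerpts."""
    excerpts = {}
    for onset, pitch, dur in notes:
        # Assign each note to the window that contains its onset.
        excerpts.setdefault(onset // excerpt_length, []).append((onset, pitch, dur))
    # Discard windows that contain fewer than min_notes notes.
    return [excerpts[i] for i in sorted(excerpts) if len(excerpts[i]) >= min_notes]


# Hypothetical notes: three in the first 15 seconds, one just after.
notes = [(0, 60, 500), (400, 62, 500), (900, 64, 500), (15200, 65, 500)]

excerpts = split_into_excerpts(notes, excerpt_length=15000, min_notes=2)
print(len(excerpts), "excerpt(s) kept")
```

With a window longer than the whole input and a small `min_notes`, as in the command above, every input file ends up as a single excerpt.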

For the full usage, see below.

In [None]:
import json

from mdtk.degradations import DEGRADATIONS

with open("config.json", "r") as json_file:
    config = json.load(json_file)

print("Degradation probabilities:")
for prob, deg in zip(config['degradation_dist'], DEGRADATIONS.keys()):
    print(f"{deg.rjust(14)}: {prob}")

print()
print(f"Clean_prop: {config['clean_prop']}")

Notice that the probabilities are not all `1/9`, as might be expected from the way ACME was created.
This is because the script finds only one possible path from each transcription
to its ground truth, which need not be the degradation that was actually applied. In our case,
it seems that many `time_shift` degradations are classified as a `remove_note` plus an `add_note`.
For the `measure_errors` script to classify an error as a `time_shift`,
the shift length must be smaller than the duration of the shifted note. With the default
`make_dataset` arguments, such short shifts are unlikely.
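
The rule above can be written as a small check on a single pair of notes. This is a sketch rather than the script's matching logic: notes are `(onset_ms, pitch, duration_ms)` tuples, the function name `looks_like_time_shift` is hypothetical, and requiring identical pitch and duration is an assumption made to keep the example short.

```python
def looks_like_time_shift(gt_note, trans_note):
    """Check whether a note pair satisfies the time_shift rule described above."""
    gt_onset, gt_pitch, gt_dur = gt_note
    trans_onset, trans_pitch, trans_dur = trans_note
    shift = abs(trans_onset - gt_onset)
    # Same pitch and duration (assumed), moved by less than the note's duration.
    return gt_pitch == trans_pitch and gt_dur == trans_dur and 0 < shift < gt_dur


gt_note = (1000, 60, 500)

# Shifted by 300 ms, less than the 500 ms duration.
small_shift = looks_like_time_shift(gt_note, (1300, 60, 500))

# Shifted by 1000 ms, more than the duration: per the text above, this
# would instead be counted as a remove_note plus an add_note.
large_shift = looks_like_time_shift(gt_note, (2000, 60, 500))

print(small_shift, large_shift)
```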

Let's look at a few specific examples. First, let's pick a random pitch shift and plot it.

In [None]:
from mdtk.fileio import csv_to_df

from utils import plot_from_df

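def get_random_dfs(deg_name, metadata):
    """Return (gt_df, trans_df) for a random excerpt with the given degradation."""
    # Look up the integer id of the named degradation.
    deg_id = degradation_ids.loc[degradation_ids["degradation_name"] == deg_name, "id"].values[0]
    meta_df = metadata.loc[metadata["degradation_id"] == deg_id]

    # Sample one excerpt; both copies are looked up by the clean file's basename.
    row = meta_df.sample()
    basename = os.path.basename(row["clean_csv_path"].values[0])
    gt_df = csv_to_df(os.path.join(gt_dir, basename))
    trans_df = csv_to_df(os.path.join(trans_dir, basename))

    return gt_df, trans_df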

In [None]:
gt_df, trans_df = get_random_dfs("pitch_shift", metadata)

plot_from_df(gt_df)

In [None]:
plot_from_df(trans_df)

Now, let's measure the degradation present in the selected excerpt.

In [None]:
import sys
sys.path.append("..")

from measure_errors import get_excerpt_degs

deg_counts = get_excerpt_degs(gt_df, trans_df)

print("Degradation counts:")
for count, deg in zip(deg_counts, DEGRADATIONS.keys()):
    print(f"{deg.rjust(14)}: {count}")

Now we can try the same for `time_shift`:

In [None]:
gt_df, trans_df = get_random_dfs("time_shift", metadata)

plot_from_df(gt_df)

In [None]:
plot_from_df(trans_df)

In [None]:
deg_counts = get_excerpt_degs(gt_df, trans_df)

print("Degradation counts:")
for count, deg in zip(deg_counts, DEGRADATIONS.keys()):
    print(f"{deg.rjust(14)}: {count}")

# Creating a Custom Dataset

You can feed the generated `config.json` file to `make_dataset.py` in order to generate a custom ACME dataset matching the measured degradation and clean proportions:

```bash
  python ../make_dataset.py --datasets PianoMidi --no-prompt --seed 42 --config config.json
```

# Input file types

The `measure_errors` script can read `MIDI`, `CSV`, or `pickle` files:
* `MIDI`: Any MIDI file.
* `CSV`: Any CSV file generated by mdtk (see `mdtk/fileio.py`).
* `pickle`: A pickle file containing a single numpy array called `piano_roll`, of shape either `num_frames x num_pitches`, or `num_frames x (2 * num_pitches)`, in which case the first `num_pitches` columns are a note presence piano roll and the last `num_pitches` columns are a corresponding onset piano roll. The min and max pitch can be set using `--pr-min-pitch` and `--pr-max-pitch`.
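
To illustrate the two accepted piano roll shapes, the sketch below splits an array into its note presence and onset halves. It only demonstrates the layout described above and is not how `measure_errors` reads its input; the function name `split_piano_roll` and the toy array are assumptions.

```python
import numpy as np


def split_piano_roll(piano_roll, num_pitches):
    """Return (presence, onsets); onsets is None for a presence-only roll."""
    num_columns = piano_roll.shape[1]
    if num_columns == num_pitches:
        return piano_roll, None
    if num_columns == 2 * num_pitches:
        # First half: note presence. Second half: note onsets.
        return piano_roll[:, :num_pitches], piano_roll[:, num_pitches:]
    raise ValueError(
        f"Expected {num_pitches} or {2 * num_pitches} columns, got {num_columns}"
    )


# Hypothetical example: 4 frames, 3 pitches, one note sounding in frames 1 and 2.
presence = np.zeros((4, 3), dtype=int)
presence[1:3, 0] = 1
onsets = np.zeros((4, 3), dtype=int)
onsets[1, 0] = 1

combined = np.hstack([presence, onsets])
split_presence, split_onsets = split_piano_roll(combined, num_pitches=3)
print(split_presence.shape, split_onsets.shape)
```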

# Command line arguments
The `measure_errors` script has many other command line arguments. A full list is below, but we will highlight a few of the most useful ones here:

* `--trans` and `--gt`: Directories in which the script will look for matching ground truths and transcriptions. Any files which match in basename (not including extension) will be treated as matches. For example, the ground truth 'file1.mid' will match the transcription 'file1.csv'. Include `-r` to search directories recursively.
* `--trans_start` and `--trans_end`: If the transcriptions are only partial transcriptions of the ground truths, these arguments can be used to set the bounds of the transcriptions (in ms). For example, if only the first 30 seconds are transcribed, use `--trans_end 30000`.
* `--excerpt-length`: Each transcription is split into excerpts of this length before errors are measured. This should be set to the excerpt length that you plan to send to your model. Shorter excerpts will lead to more accurate error measurements, but longer excerpts may contain long-range patterns that aid in modelling.

In [None]:
! python ../measure_errors.py -h

In [None]:
# Clean up
! python ../make_dataset.py --no-prompt --clean

import shutil
import os

shutil.rmtree("acme")
shutil.rmtree("error_matching")
os.remove("config.json")