# Optimization of vocals processing and transcription parameters
Optimize the parameters of the audio effects and of Basic Pitch using `gp_minimize`, skopt's implementation of Bayesian optimization.
The optimization runs on the separated vocals track and scores each transcription against the reference MIDI file created from the USDX song, using the F-measure as the evaluation metric.

## Load reference MIDI file
Load a MIDI file created from a USDX song file and convert the MIDI data to a list of note events for `mir_eval`.
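
The `prepare_eval_data` helper lives in the local `evaluate_midi` module and is not shown here. As a rough sketch of what such a conversion might involve, the example below assumes the target format is the one used by `mir_eval`'s transcription metrics: an array of `(onset, offset)` intervals in seconds and an array of pitches in Hz. The `Note` tuple and `notes_to_eval_arrays` are hypothetical names used for illustration only; `Note` stands in for `pretty_midi.Note`, which exposes the same `start`, `end` and `pitch` attributes.

```python
from collections import namedtuple

import numpy as np

# Hypothetical stand-in for pretty_midi.Note (same start, end and pitch attributes)
Note = namedtuple("Note", ["start", "end", "pitch"])


def notes_to_eval_arrays(notes):
    """Hypothetical helper: convert note objects to (intervals, pitches_hz)."""
    # One row per note: [onset, offset] in seconds; shape (0, 2) for an empty list
    intervals = np.array([[n.start, n.end] for n in notes], dtype=float).reshape(-1, 2)
    # MIDI note number -> frequency in Hz (A4 = MIDI 69 = 440 Hz)
    pitches_hz = np.array(
        [440.0 * 2.0 ** ((n.pitch - 69) / 12.0) for n in notes], dtype=float
    )
    return intervals, pitches_hz


intervals, pitches_hz = notes_to_eval_arrays([Note(0.0, 0.5, 69), Note(0.5, 1.0, 81)])
print(intervals.shape, pitches_hz)
```

The actual helper may return a different structure; only the conversion idea is illustrated here.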

In [None]:
import pretty_midi
from evaluate_midi import prepare_eval_data

# Load reference notes from USDX MIDI file
midi = pretty_midi.PrettyMIDI('../data/out/usdx-midi/song.mid')
original_midi = midi.instruments[0].notes

reference_notes = prepare_eval_data(original_midi)

## Predict notes from vocals track
Predict the notes from the separated vocals track using Basic Pitch.

In [None]:
import tensorflow as tf
from basic_pitch.inference import predict, ICASSP_2022_MODEL_PATH

# Load prediction model to cache it between function calls
basic_pitch_model = tf.saved_model.load(str(ICASSP_2022_MODEL_PATH))

# Generate MIDI from vocals track
def predict_notes(**options):
    model_output, midi_data, note_events = predict(
        "../data/out/optimized_audio/vocals.wav",
        basic_pitch_model,
        minimum_frequency=80,
        maximum_frequency=1000,
        **options
    )
    if len(midi_data.instruments) == 0:
        print(f"No notes predicted for params {options!r}")
        return []
    return midi_data.instruments[0].notes

## Run optimization
Compare the prediction to the reference notes using `mir_eval`'s F-measure implementation.
Optimize parameters based on the result.
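
To make the metric concrete, here is a simplified, pure-Python sketch of a note-level F-measure and the loss derived from it. It is not `mir_eval`'s implementation: it matches notes greedily on onset time and exact MIDI pitch and ignores offsets, whereas `mir_eval` uses an optimal bipartite matching and tolerance-based pitch comparison. The function name, the 50 ms onset tolerance and the sample notes are assumptions chosen for illustration.

```python
def simple_note_f_measure(reference, estimated, onset_tolerance=0.05):
    """Greedy onset/pitch matching; notes are (onset_seconds, midi_pitch) tuples."""
    unmatched = list(estimated)
    matches = 0
    for ref_onset, ref_pitch in reference:
        for est in unmatched:
            est_onset, est_pitch = est
            if abs(est_onset - ref_onset) <= onset_tolerance and est_pitch == ref_pitch:
                # Each estimated note may be matched to one reference note only
                unmatched.remove(est)
                matches += 1
                break
    if matches == 0:
        return 0.0
    precision = matches / len(estimated)  # share of predicted notes that are correct
    recall = matches / len(reference)     # share of reference notes that were found
    return 2 * precision * recall / (precision + recall)


# Illustrative notes only, not taken from any real song
reference = [(0.00, 60), (0.50, 62), (1.00, 64)]
estimated = [(0.02, 60), (0.51, 62), (1.30, 64), (1.60, 65)]

f_measure = simple_note_f_measure(reference, estimated)
loss = 1.0 - f_measure  # the optimizer minimizes, so a higher F-measure means a lower loss
print(f_measure, loss)
```

Two of the four estimated notes match two of the three reference notes, giving a precision of 1/2, a recall of 2/3 and an F-measure of 4/7.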

In [None]:
# Imports for evaluation and optimization
from evaluate_midi import evaluate
from skopt import gp_minimize
from skopt.space import Real, Integer

# Monkey patch numpy to avoid a skopt error: skopt still references the
# `numpy.int` alias, which was removed in NumPy 1.24
# (see https://github.com/scikit-optimize/scikit-optimize/issues/1138 for details)
import numpy
numpy.int = int

def evaluate_params(params):
    """Objective for gp_minimize: returns 1 - F-measure, so lower is better."""
    options = {
        "onset_threshold": params[0], # default: 0.5
        "frame_threshold": params[1], # default: 0.3
        "minimum_note_length": params[2], # default: 127.7
    }
    notes = predict_notes(**options)
    if len(notes) == 0:
        # gp_minimize minimizes the return value, so report the worst
        # possible loss (F-measure of 0) when no notes were predicted
        return 1.0
    estimated_notes = prepare_eval_data(notes)
    scores = evaluate(reference_notes, estimated_notes)
    return 1.0 - scores['F-measure']

# Parameter ranges for prediction options
param_ranges = [
    Real(0.2, 0.8, name="onset_threshold"),
    Real(0.2, 0.8, name="frame_threshold"),
    Integer(80, 250, name="minimum_note_length")
]

# Keep the result so the best parameters can be inspected afterwards
result = gp_minimize(evaluate_params, param_ranges, n_calls=100)
result
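
The object returned by `gp_minimize` exposes the best parameter vector as `x` and the lowest objective value as `fun`. The sketch below shows one possible way to turn these into named options and back into an F-measure. `summarize_result` is a hypothetical helper, and `fake_result` is a stand-in with made-up placeholder values so the snippet runs without repeating the optimization; it is not an actual result.

```python
from types import SimpleNamespace

# Same order as the dimensions in param_ranges
PARAM_NAMES = ["onset_threshold", "frame_threshold", "minimum_note_length"]


def summarize_result(result, names=PARAM_NAMES):
    """Hypothetical helper: name the best parameters and recover the F-measure."""
    best_options = dict(zip(names, result.x))
    # The objective returns 1 - F-measure, so invert it again
    best_f_measure = 1.0 - result.fun
    return best_options, best_f_measure


# Placeholder values only, shaped like the `x` and `fun` attributes of a real result
fake_result = SimpleNamespace(x=[0.45, 0.3, 120], fun=0.25)

best_options, best_f_measure = summarize_result(fake_result)
print(best_options, best_f_measure)
```

The returned dictionary could be passed straight to `predict_notes(**best_options)` to regenerate the transcription with the best parameters found.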