# Automatic Tuning of SAMPLE hyperparameters
In this notebook, we will see how to automatically tune the hyperparameters of SAMPLE.

## Setup

### Libraries
Install the `sample` package and its dependencies.
The extras install the dependencies for helper functions, such as plots.

In [None]:
import sys
!$sys.executable -m pip install -qU lim-sample[notebooks,plots]==2.2.0
import sample

sample(logo=dict(size_inches=6))

### Load audio
Download the test audio or load your own audio file. In this notebook, you can specify:

   - a filename: to load the audio from a file
   - a URL: to download the audio file from the web (used only if `fname` is empty)
   - start time and length (in seconds): to cut the audio file

In [None]:
import io

import numpy as np
import requests
from IPython import display as ipd
from matplotlib import pyplot as plt
from scipy.io import wavfile


def resize(diag: float = 8.485, aspect: float = 1, shape=(1, 1)):
  plt.gcf().set_size_inches(
      diag * np.true_divide([aspect, 1], np.sqrt(aspect * aspect + 1)) *
      np.flip(shape))


fname = ""  #@param {type: "string"}
url = "https://gist.github.com/ChromaticIsobar/dcde518ec070b38312ef048f472d92aa/raw/3a69a5c6285f4516bae840eb565144772e8809ae/glass.wav"  #@param {type: "string"}
start_time = 7.65  #@param {type: "number"}
time_length = 2.56  #@param {type: "number"}

if fname:
  fs, x = wavfile.read(fname)
else:
  r = requests.get(url)
  with io.BytesIO() as buf:
    buf.write(r.content)
    del r
    buf.seek(0)
    fs, x = wavfile.read(buf)
x = x / -np.iinfo(x.dtype).min

i_0 = int(start_time * fs)
i_1 = i_0 + int(time_length * fs)

x = x[i_0:i_1]
t = np.arange(x.size) / fs

ipd.display(ipd.Audio(x, rate=fs))

plt.plot(t, x, alpha=.5, zorder=100)
plt.grid()
resize(aspect=16 / 9)
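
The normalization and cropping steps of the cell above can be looked at in isolation. This is a minimal sketch, assuming integer PCM data (16-bit here); the tiny synthetic array, the sample rate and the crop times are made up for illustration.

```python
import numpy as np

fs = 8000  # assumed sample rate in Hz, for illustration only

# Synthetic int16 "recording" that spans the full integer range
x_int = np.array([-32768, -16384, 0, 16384, 32767], dtype=np.int16)

# Dividing by -min (32768 for int16) maps the integer range onto [-1, 1)
x_norm = x_int / -np.iinfo(x_int.dtype).min
print(x_norm)

# Cropping by time: convert seconds to sample indices, then slice
signal = np.zeros(2 * fs)  # two seconds of silence
start_time, time_length = 0.5, 1.0
i_0 = int(start_time * fs)
i_1 = i_0 + int(time_length * fs)
cropped = signal[i_0:i_1]
print(cropped.size / fs, "seconds")
```

Note that `np.iinfo` raises an error for floating-point dtypes, so this normalization only applies to audio files stored as integers.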

## Define optimization problem

We want to find a set of parameter values for the `SAMPLE` algorithm such that it produces an output audio as similar as possible to the input.  
Let's list all available parameters.

We can define some parameters to be fixed, and not be tuned by the optimizer.
We will put a limit on the maximum number of synthesized modes (`max_n_modes=32`) to avoid excessive overfitting.

In [None]:
from sample.ipython import CollapsibleModelParams
import sample

base_model = sample.SAMPLEBeatsDROP(
    max_n_modes=32,
    sinusoidal__tracker__strip_t=0.5,
    sinusoidal__tracker__peak_threshold=-60.0,
    sinusoidal__tracker__reverse=True,
    sinusoidal__tracker__frequency_bounds=(100, 20e3),
)
CollapsibleModelParams(base_model)
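
The double-underscore names used above, such as `sinusoidal__tracker__strip_t`, look like the scikit-learn convention for addressing the parameters of nested components (the model is also used with `get_params` and `set_params` later on). Here is a minimal sketch of how such a key can be resolved. The classes `Tracker`, `Sinusoidal` and `Model` and the helper `set_nested` are hypothetical stand-ins, not part of the `sample` package.

```python
from dataclasses import dataclass, field


@dataclass
class Tracker:  # hypothetical innermost component
    strip_t: float = 0.0
    reverse: bool = False


@dataclass
class Sinusoidal:  # hypothetical intermediate component
    tracker: Tracker = field(default_factory=Tracker)


@dataclass
class Model:  # hypothetical top-level model
    max_n_modes: int = 64
    sinusoidal: Sinusoidal = field(default_factory=Sinusoidal)


def set_nested(obj, **params):
    """Set attributes addressed by double-underscore-separated paths."""
    for key, value in params.items():
        *path, name = key.split("__")
        target = obj
        for part in path:  # walk down to the component that owns the parameter
            target = getattr(target, part)
        if not hasattr(target, name):
            raise AttributeError(f"unknown parameter: {key}")
        setattr(target, name, value)
    return obj


model = set_nested(Model(), max_n_modes=32, sinusoidal__tracker__strip_t=0.5)
print(model)
```

Each `__` moves one level deeper, so a flat dictionary of keyword arguments is enough to configure the whole hierarchy.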

Define the space of the parameters to be tuned. We will automatically adjust:
 - the logarithm of the FFT size
 - the number of sinusoidal peaks per window
 - the threshold for peak detection
 - the minimum trajectory duration

In [None]:
import skopt.space

sample_opt_space = dict(
    sinusoidal__log_n=skopt.space.Integer(6, 14, name="log2(n)"),
    sinusoidal__tracker__max_n_sines=skopt.space.Integer(32,
                                                         256,
                                                         name="n sines"),
    sinusoidal__t=skopt.space.Real(-120, -45, name="fft threshold"),
    sinusoidal__tracker__min_sine_dur=skopt.space.Real(0,
                                                       0.5,
                                                       name="min duration"),
)
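
To get a feeling for what the `log2(n)` range means in practice, the sketch below converts each candidate value to an FFT size, a frame duration and a frequency-bin spacing. The sample rate of 44100 Hz is an assumption for illustration; use the `fs` of your own file.

```python
fs = 44100  # assumed sample rate in Hz; replace with the rate of your file

fft_sizes = []
for log_n in range(6, 15):  # same bounds as the search space: 6 to 14 inclusive
    n = 2**log_n  # FFT size in samples
    frame_ms = 1000 * n / fs  # frame duration in milliseconds
    bin_hz = fs / n  # spacing between FFT bins in Hz
    fft_sizes.append(n)
    print(f"log2(n)={log_n:2d}  n={n:5d}  "
          f"frame={frame_ms:7.2f} ms  bin={bin_hz:7.2f} Hz")
```

Searching over the exponent rather than the size itself lets an integer dimension cover FFT sizes from 64 to 16384 with only nine candidate values.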

We will use the cochleagram to define an objective function.  
The difference between the input audio's cochleagram and the output's will quantify how dissimilar the two sounds are.  
This is the value we want the optimizer to minimize.

In [None]:
from sample.evaluation.metrics import CochleagramLoss
from sample.utils.dsp import complex2db
from functools import partial

cochleagram_loss = CochleagramLoss(fs=fs,
                                   normalize=True,
                                   analytical="ir",
                                   stride=int(fs * 0.008),
                                   postprocessing=partial(complex2db,
                                                          floor=-60,
                                                          floor_db=True))
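
The idea behind this kind of objective can be sketched with plain NumPy. The example below is a simplified stand-in, not the `CochleagramLoss` implementation: it uses an ordinary windowed FFT instead of a cochlear filterbank, and the function names `db_magnitude` and `spectral_loss` are hypothetical. What it shares with the loss above is the overall recipe: compute a time-frequency representation in decibels, clip it at a floor, and average the absolute difference.

```python
import numpy as np


def db_magnitude(x, n=256, stride=64, floor=-60.0):
    """Magnitude spectrogram in dB, clipped from below at `floor`."""
    window = np.hanning(n)
    starts = range(0, len(x) - n + 1, stride)
    frames = np.stack([x[i:i + n] * window for i in starts])
    mag = np.abs(np.fft.rfft(frames, axis=-1))
    with np.errstate(divide="ignore"):  # log10(0) gives -inf, clipped below
        db = 20 * np.log10(mag)
    return np.maximum(db, floor)


def spectral_loss(x, y, **kwargs):
    """Mean absolute difference between the two dB representations."""
    diff = db_magnitude(x, **kwargs) - db_magnitude(y, **kwargs)
    return float(np.mean(np.abs(diff)))


fs = 8000  # assumed sample rate in Hz, for illustration only
t = np.arange(fs) / fs
target = np.sin(2 * np.pi * 440 * t)
quieter = 0.9 * target  # same sound, slightly attenuated
other = np.sin(2 * np.pi * 1200 * t)  # a different pitch

print(spectral_loss(target, target))
print(spectral_loss(target, quieter))
print(spectral_loss(target, other))
```

A slightly attenuated copy of the target yields a small loss, while a tone at a different pitch yields a much larger one, which is the behavior we want from a value that the optimizer minimizes.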

## Optimize
Run the optimization procedure.
Depending on the number of iterations, this could take a couple of minutes or more.

In [None]:
# `tqdm.tqdm_notebook` is deprecated; import the notebook progress bar directly
from tqdm.notebook import tqdm as tqdm_notebook
import sample.optimize

#@markdown Check this to restart the optimization from scratch
reset = True  #@param {type:"boolean"}
#@markdown ---
#@markdown Number of optimization iterations
n_minimizing_points = 32  #@param {type:"integer"}
#@markdown Number of exploratory iterations
n_initial_points = 32  #@param {type:"integer"}
#@markdown ---
#@markdown Random seed
seed = 42  #@param {type:"integer"}

# Setup optimizer
n_calls = n_minimizing_points + n_initial_points
if reset or "opt_res" not in locals():
  opt_res = None
sample_opt = sample.optimize.SAMPLEOptimizer(
    model=base_model,
    loss_fn=cochleagram_loss,
    **sample_opt_space,
)

# This is only needed to make the progressbar
tqdm_cbk = sample.optimize.TqdmCallback(
    sample_opt=sample_opt,
    n_calls=n_calls,
    n_initial_points=n_initial_points,
    tqdm_fn=tqdm_notebook,
)

opt_model, opt_res = sample_opt.gp_minimize(x=x,
                                            fs=fs,
                                            n_calls=n_calls,
                                            n_initial_points=n_initial_points,
                                            callback=tqdm_cbk,
                                            initial_point_generator="lhs",
                                            acq_func="LCB",
                                            state=opt_res,
                                            random_state=seed,
                                            fit_kws=dict(n_jobs=6))
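
The exploratory iterations above are drawn with `initial_point_generator="lhs"`, that is, Latin hypercube sampling. The sketch below shows the classic form of the technique using only the standard library. It is an illustration of the idea and not necessarily what scikit-optimize does internally, since the library offers several variants; the function name `latin_hypercube` is hypothetical, and the bounds are borrowed from two of the real-valued dimensions defined earlier.

```python
import random


def latin_hypercube(n_points, bounds, seed=None):
    """Draw n_points from a box using classic Latin hypercube sampling."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniformly random position inside each of the n_points strata
        unit = [(i + rng.random()) / n_points for i in range(n_points)]
        rng.shuffle(unit)  # decouple the strata order across dimensions
        columns.append([low + u * (high - low) for u in unit])
    # Transpose: one tuple per sampled point
    return list(zip(*columns))


bounds = [(-120.0, -45.0), (0.0, 0.5)]
points = latin_hypercube(8, bounds, seed=42)
for p in points:
    print(p)
```

Each dimension is split into as many equal intervals as there are points, and every interval is used exactly once. Compared to independent uniform draws, this spreads a small number of exploratory points more evenly over the search space.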

### Listen back
Listen to an additive resynthesis of the sound based on the estimated modal parameters.
You can change the number of synthesized modes.

In [None]:
from sample.ipython import LabelAndPlayForeach
from sample import plots

#@markdown Number of modes for resynthesis
n_modes = 32  #@param {type:"integer"}

n_modes_old = opt_model.get_params()["max_n_modes"]
opt_model.set_params(max_n_modes=n_modes)

fig, axs = plots.resynthesis(
    x,
    models={"SAMPLE": opt_model},
    db_floor=-120,
    foreach=LabelAndPlayForeach(audio_kws=dict(rate=fs)))
opt_model.set_params(max_n_modes=n_modes_old)
axs[0].set_ylim(-1.05, 1.05)
resize(aspect=1, shape=(2, len(axs) - 1))
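
To make the idea of additive resynthesis concrete, here is a minimal sketch that sums exponentially decaying sinusoids, one per mode. The parametrization (frequency in Hz, decay rate in 1/s, linear amplitude, zero phase) and the function name `additive_synth` are assumptions for illustration and may differ from what `sample` uses internally; the mode values are made up.

```python
import numpy as np


def additive_synth(freqs, decays, amps, fs, duration):
    """Sum of exponentially decaying sinusoids, one per mode."""
    t = np.arange(int(duration * fs)) / fs
    # Column vectors, so that each mode broadcasts against the time axis
    freqs, decays, amps = (np.asarray(a, dtype=float)[:, None]
                           for a in (freqs, decays, amps))
    partials = amps * np.exp(-decays * t) * np.sin(2 * np.pi * freqs * t)
    return partials.sum(axis=0)


fs = 8000  # assumed sample rate in Hz, for illustration only
y = additive_synth(
    freqs=[440, 1130, 2210],  # made-up modal frequencies (Hz)
    decays=[3.0, 6.0, 12.0],  # made-up decay rates (1/s)
    amps=[0.5, 0.3, 0.2],  # made-up amplitudes
    fs=fs,
    duration=1.0,
)
print(y.shape, np.abs(y).max())
```

Lowering the number of modes amounts to dropping rows from these arrays, which is why the resynthesis gets coarser as `n_modes` decreases.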