### Notebook Preliminaries 

Make equation numbers work in LaTeX Markdown cells.

In [1]:
%%javascript
MathJax.Hub.Config({
    TeX: { equationNumbers: { autoNumber: "AMS" } }
});
MathJax.Hub.Register.StartupHook("TeX AMSmath Ready", function () {
  var AMS = MathJax.Extension['TeX/AMSmath'];
  MathJax.InputJax.TeX.postfilterHooks.Add(function (data) {
    var jax = data.script.MathJax;
    jax.startNumber = AMS.startNumber;
    jax.eqLabels = AMS.eqlabels;
    jax.eqIDs = AMS.eqIDs;
  });
  MathJax.InputJax.TeX.prefilterHooks.Add(function (data) {
    var jax = data.script.MathJax;
    if (jax.startNumber != undefined) {
      AMS.startNumber = jax.startNumber;
      Object.keys(jax.eqLabels).forEach(function (x) {delete AMS.labels[x]});
      Object.keys(jax.eqIDs).forEach(function (x) {delete AMS.IDs[x]});
    }
  }, 1);
});


## Algorithm

#### Global constants across different experiments

- Minimum frequency to measure: 60Hz
- Hop: 0.01s between measurements
- Maximum frequency: 2000Hz, used as a default value when the algorithm gets confused
- Minimum singing sound volume: 0.05

In [2]:
def min_f0():
    return 60

def hop_ms():
    return 10

def max_f0():
    return 2000

def min_singing_volume():
    return 0.05


### Main steps

The McLeod algorithm, like Yin, works in the "time domain", i.e., the original audio. The first two steps are the same as for Yin except that McLeod uses the true autocorrelation.

- Use the method in the [paper of McLeod and Wyvill](https://drive.google.com/file/d/1n228Ly4G4MuCBmXyNw9XH9smGhTCfRlc/view?usp=sharing) to detect the (one) fundamental frequency of the voice in the audio
    - Step 1. True autocorrelation.
    - Step 2. Difference function.
    - Step 3. Normalized squared difference function (NSDF): makes the minimum clearer (removes need for a min period)
    - Step 4. Threshold for the minimum of the difference function.

### Notes

Notes on development.

- 8/5/20
    - First version.

### Step 1. Simple autocorrelation

The paper uses the usual definition of autocorrelation:

\begin{equation}
r_t(\tau) = \sum^{t+W-1-\tau}_{j=t} x_j x_{j+\tau}
\end{equation}

For reference, Yin uses the following formula as its default, with a fixed number of terms $W$ for every $\tau$:

\begin{equation}
r_t(\tau) = \sum^{t+W-1}_{j=t} x_j x_{j+\tau}
\end{equation}
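
The two definitions differ only in the upper limit of the sum. As a minimal sketch, assuming the signal is a NumPy array `x`, both can be evaluated by direct summation. The helper names `acf_tapered` and `acf_fixed` are hypothetical and are not part of this notebook.

```python
import numpy as np

def acf_tapered(x, W, t=0):
    """Eqn (1): the number of terms shrinks with tau; only x[t:t+W] is read."""
    return np.array([np.dot(x[t:t + W - tau], x[t + tau:t + W])
                     for tau in range(W)])

def acf_fixed(x, W, t=0):
    """Yin-style default: always W terms; reads samples up to x[t+2W-2]."""
    return np.array([np.dot(x[t:t + W], x[t + tau:t + tau + W])
                     for tau in range(W)])

# illustrative random signal (assumption: any real-valued array works)
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
W = 16
r1 = acf_tapered(x, W)
r2 = acf_fixed(x, W)
```

At $\tau = 0$ the two sums coincide. For larger $\tau$ the first definition uses fewer terms, which is why it can be obtained from a single zero-padded FFT of the window.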

#### Define helper functions and a function to calculate f0 at equally spaced sample points in an audio file

In [10]:
import scipy.signal as ss
import scipy.fft
import librosa
import numpy as np
import matplotlib.pyplot as plt

# initial low-pass filter
def lowpass_filter(y):
    return ss.convolve(y, np.ones(48), mode='same', method='auto')

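# autocorrelation function - Eqn (1) above
#  - computed by fft, squared, inverse fft'd
#  - assumes the window y[t:t+W] is zero-padded to length 2W so that the
#    FFT product gives a linear (non-circular) autocorrelation
#  - returns r_t(tau) for tau = 0, ..., W-1
def acf(y, W, t=0):
    pad = np.concatenate((y[t:t+W], np.zeros(W)))
    return np.real(scipy.fft.ifft(np.square(np.abs(scipy.fft.fft(pad)))))[:W]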
    
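# incremental update of Eqn (1): builds r_t from r_{t-1} by adding the products
# with the new last sample y[t+W-1] and dropping those with the old first sample y[t-1]
# - Does something only if prev_acf.size >= W and t>0
def acf_incremental(y, W, t, prev_acf):
    if t <= 0 or prev_acf.size < W:
        return None
    try:
        # y[t:t+W][::-1][tau] is y[t+W-1-tau]; y[t-1:t+W-1][tau] is y[t-1+tau]
        return prev_acf + y[t+W-1]*y[t:t+W][::-1] - y[t-1]*y[t-1:t+W-1]
    except ValueError:
        print(prev_acf.size, t, W, y.size)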
            
# returns a linear ramp from 1 down to 1/tau_max, used to dampen the autocorrelation function
def get_ramp(tau_max):
    return 1 - np.arange(tau_max)/tau_max

# This relationship is from the paper.
def W_ms():
    return 1000//min_f0()

# returns a batch of autocorrelation functions indexed by (t, tau)
# - the first row is computed with acf(), later rows with acf_incremental()
def get_acf_in_range(y, W, start, end):
    acfs = np.empty((end-start, W))

    for t in range(start, end):
        if t == start:
            acfs[t-start,:] = acf(y, W, t)
        else:
            acfs[t-start,:] = acf_incremental(y, W, t, acfs[t-start-1,:])

    return acfs

# finds f0 from an autocorrelation function
# - for now, finds the second strongest wavelength of the acf, after 0:
#   - Assume f0 <= max_f0, so wavelength >= sr/max_f0
#   - Find argmax(acf(tau)), tau >= sr/max_f0
# - acf: a list of acf(tau) for tau = 0,..., W-1
def get_f0(acf, sr):
    min_lag = sr//max_f0()
    return sr//(min_lag + np.argmax(acf[min_lag:]))

### Step 2. Difference function

From the Yin paper, the "signal model" for a periodic signal of period $T$

\begin{equation}
x_t - x_{t+T} = 0,~\forall t
\end{equation}

suggests minimizing the sum of squares of differences 

\begin{align}
d_t(\tau) & = \sum^{t+W-\tau - 1}_{j=t} (x_j - x_{j+\tau})^2 \\
          & = m_t(\tau) - 2r_t(\tau),
\end{align}

where

\begin{equation}
m_t(\tau) = \sum^{t+W-\tau - 1}_{j=t} \left( x^2_j + x^2_{j+\tau} \right).
\end{equation}
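
The identity $d_t(\tau) = m_t(\tau) - 2r_t(\tau)$ follows from expanding the square term by term. The sketch below checks it numerically on random data; the function name `diff_terms` and the test values are illustrative assumptions.

```python
import numpy as np

def diff_terms(x, W, t, tau):
    """Return (d, m, r) for one (t, tau) pair by direct summation."""
    a = x[t:t + W - tau]        # x_j       for j = t .. t+W-tau-1
    b = x[t + tau:t + W]        # x_{j+tau} over the same range of j
    d = np.sum((a - b) ** 2)    # difference function
    m = np.sum(a ** 2 + b ** 2) # energy term
    r = np.dot(a, b)            # autocorrelation, Eqn (1)
    return d, m, r

rng = np.random.default_rng(1)
x = rng.standard_normal(100)
d, m, r = diff_terms(x, W=32, t=10, tau=5)
identity_holds = np.isclose(d, m - 2 * r)
```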

#### Note

We need only compute $d_t(\tau)$ at $t$ equal to multiples of the hop (`hop_ms()` = 10 ms), i.e., every $sr/100$ samples.
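
As a rough sketch of what this spacing means in samples, the helpers below convert the hop to a sample count and list the frame start indices. The names `hop_samples` and `frame_starts` and the 44100 Hz sample rate are assumptions for illustration only.

```python
def hop_samples(sr, hop_ms=10):
    """Number of samples between measurements (sr/100 for a 10 ms hop)."""
    return sr * hop_ms // 1000

def frame_starts(n_samples, sr, W, hop_ms=10):
    """Start indices t at which a full window of W samples still fits."""
    hop = hop_samples(sr, hop_ms)
    return list(range(0, n_samples - W + 1, hop))

# assumed example: 44.1 kHz audio, 2000 samples, window of 705 samples
starts = frame_starts(2000, 44100, 705)
```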


### Step 2, revised. Symmetric Difference function

To average nearby frequencies at time $t$ for each $\tau$, the paper uses a sum symmetric around $t$:

\begin{align}
d_t(\tau) & = \sum^{t+(W-\tau)/2 - 1}_{j=t-(W-\tau)/2} (x_j - x_{j+\tau})^2 \\
          & = m_t(\tau) - 2r_t(\tau),
\end{align}

where $r_t(\tau)$ is now summed over the same symmetric range of $j$, and

\begin{equation}
m_t(\tau) = \sum^{t+(W-\tau)/2 - 1}_{j=t-(W-\tau)/2} \left( x^2_j + x^2_{j+\tau} \right).
\end{equation}

### Step 3. Normalized Squared Difference Function (NSDF)

The normalized squared difference function is

\begin{align}
n_t(\tau) & = 1 - \frac{d_t(\tau)}{m_t(\tau)} \\
          & = \frac{2r_t(\tau)}{m_t(\tau)}.
\end{align}
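
Because $2ab \le a^2 + b^2$, the value $n_t(\tau)$ always lies in $[-1, 1]$. The sketch below evaluates it by direct summation over the one-sided window of Step 2 for a pure sine wave. The function name `nsdf`, the 8000 Hz sample rate, and the 200 Hz test tone are assumptions chosen so that one period is exactly 40 samples.

```python
import numpy as np

def nsdf(x, W, t=0):
    """n_t(tau) = 2 r_t(tau) / m_t(tau) for tau = 0 .. W-1 (direct sums)."""
    out = np.zeros(W)
    for tau in range(W):
        a = x[t:t + W - tau]          # x_j
        b = x[t + tau:t + W]          # x_{j+tau}
        m = np.sum(a ** 2 + b ** 2)
        r = np.dot(a, b)
        out[tau] = 2 * r / m if m > 0 else 0.0
    return out

sr = 8000                              # assumed sample rate
f = 200.0                              # assumed test tone, period = 40 samples
x = np.sin(2 * np.pi * f * np.arange(1024) / sr)
n = nsdf(x, W=400)
```

For this tone, `n` is close to 1 at lags 0 and 40 (whole periods) and close to -1 at lag 20 (half a period).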


#### We can compute the autocorrelation $r_t(\tau)$ fast using FFT.

#### Computing $m_t(\tau)$ fast:

Let's use the recursion

\begin{equation}
m_t(\tau) = m_{t-1}(\tau) + x^2_{t+\frac{W-\tau}{2}-1} - x^2_{t-\frac{W-\tau}{2}-1}
                          + x^2_{t+\frac{W-\tau}{2}+\tau-1} - x^2_{t-\frac{W-\tau}{2}+\tau-1} 
\end{equation}

to fill the 1D array $\{m_t(\tau): \tau = 0, 1, \ldots, \tau_{max}\}$, for each $t = 0, 1, \ldots$.
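
The recursion can be checked against the direct symmetric sum. The sketch below assumes the half-width $(W-\tau)/2$ is rounded down to an integer and that $t$ is large enough for every index to stay non-negative. The names `m_direct` and `m_step` are hypothetical.

```python
import numpy as np

def m_direct(x, W, t, tau):
    """Symmetric m_t(tau) by direct summation over j = t-h .. t+h-1."""
    h = (W - tau) // 2                 # assumed integer half-width
    j = np.arange(t - h, t + h)
    return np.sum(x[j] ** 2 + x[j + tau] ** 2)

def m_step(m_prev, x, W, t, tau):
    """One step of the recursion: m_t(tau) from m_{t-1}(tau)."""
    h = (W - tau) // 2
    return (m_prev
            + x[t + h - 1] ** 2 - x[t - h - 1] ** 2
            + x[t + h + tau - 1] ** 2 - x[t - h + tau - 1] ** 2)

rng = np.random.default_rng(2)
x = rng.standard_normal(200)
W, t = 32, 50
m_prev = np.array([m_direct(x, W, t - 1, tau) for tau in range(W)])
m_rec = np.array([m_step(m_prev[tau], x, W, t, tau) for tau in range(W)])
m_ref = np.array([m_direct(x, W, t, tau) for tau in range(W)])
```

Each update costs four squared samples per $\tau$, compared with about $2(W-\tau)$ for the direct sum.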

### Step 4. Threshold for minimum of difference function

Choose the smallest local minimum value of $d'$ below a threshold.

The paper chooses a threshold of $0.1$.

I've modified that to a threshold of $\min(0.1,\, 2 \times \text{global min})$ because the Yin algorithm picks up higher frequencies from the larynx mic during a glissando.
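
One possible reading of this rule is sketched below, under stated assumptions: the input is an array of normalized difference values indexed by lag, "global min" is the minimum over the whole array, and the chosen lag is the first interior local minimum that falls below the threshold. The function name `pick_lag` and the example values are hypothetical.

```python
import numpy as np

def pick_lag(d_norm, base_threshold=0.1):
    """Return the first local-minimum lag below min(0.1, 2 * global min), or None."""
    d_norm = np.asarray(d_norm, dtype=float)
    threshold = min(base_threshold, 2.0 * d_norm.min())
    i = np.arange(1, len(d_norm) - 1)            # interior lags only
    is_min = (d_norm[i] < d_norm[i - 1]) & (d_norm[i] <= d_norm[i + 1])
    candidates = i[is_min & (d_norm[i] < threshold)]
    return int(candidates[0]) if candidates.size else None

# made-up values: a shallow dip at lag 3 (0.09) and a deep dip at lag 6 (0.03)
d_example = [1.0, 0.8, 0.5, 0.09, 0.4, 0.7, 0.03, 0.5, 0.9]
lag = pick_lag(d_example)
```

With a fixed threshold of 0.1 the shallow dip at lag 3 would be accepted, which corresponds to a higher frequency. The tightened threshold of 0.06 skips it and selects lag 6.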
