# 21M.387 Fundamentals of Music Processing
## Lab5

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
from ipywidgets import interact
import sys
sys.path.append("..")
import fmplib as fmp
from fmplib.pyqueue import connect_to_queue

plt.rcParams['figure.figsize'] = (12, 4)
plt.rcParams['image.interpolation'] = 'nearest'

fmp.documentation_button()

## Exercise 1

For the matrix `m` below, create two vectors:
- the average values along the columns (to produce the average row vector)
- the average values along the rows (to produce the average column vector)
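
As a reminder of how the `axis` argument behaves, here is a minimal sketch on a small stand-in matrix `a` (hypothetical, not the lab's `m`). The use of `keepdims=True` to get an explicit column shape is one option among several, not a required approach.

```python
import numpy as np

# Hypothetical stand-in matrix (not the lab's `m`): 2 rows, 3 columns.
a = np.array([[1., 2., 3.],
              [4., 5., 6.]])

# axis=0 collapses the rows, leaving one mean per column (a row vector).
col_means = a.mean(axis=0)

# axis=1 collapses the columns, leaving one mean per row.
# keepdims=True keeps the result 2-D, with shape (2, 1), i.e. a column vector.
row_means = a.mean(axis=1, keepdims=True)

print(col_means.shape, row_means.shape)
```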

In [None]:
m = np.array(((8, 2, 7, 7), (6., 3, 5, 8), (1, 4, 9, 6)))
print(m)

In [None]:
connect_to_queue()

## Exercise 2
In the following exercise, we will explore the auto-correlation tempogram using this piece of audio:

In [None]:
snd1 = fmp.load_wav("audio/queen_another_one.wav", 0, 30)
fs = 22050
ipd.Audio(snd1, rate = fs)

Generate $\Delta^s[n]$, the spectral novelty function of `snd1`, using:  
`fmp.spectral_novelty(x, win_len, hop_size, gamma)`

Use the parameters:
- $N=1024$ 
- $H=512$ 
- $\gamma=100$

Plot $\Delta^s[n]$

Now grab a small window of $\Delta^s[n]$ starting at $n=200$, with window length $L=256$. We'll call this signal $x[n]$.

Plot $x[n]$.

Create a function to compute the auto-correlation of a signal $x[n]$ with length $L$:
$$ R_{xx}[l] = \sum_{n=0}^{L-1} x[n] \cdot x[n-l] $$

You should assume that $x[n]$ is zero outside the window bounds $n \in [0,L-1]$.

Use a Python `for` loop over each lag value $l$ (from $0$ to $L-1$), computing $R_{xx}[l]$ with a dot product.

- Find $R_{xx}$ for the windowed signal $x[n]$
- Plot the result
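
One possible way to organize the loop is sketched below on a four-sample toy signal. The name `auto_correlate_sketch` and the array `toy` are hypothetical; the sketch relies on the zero-outside-the-window assumption, so for lag $l$ only the samples $n = l, \dots, L-1$ contribute to the sum.

```python
import numpy as np

def auto_correlate_sketch(x):
    """Autocorrelation for lags 0..L-1, treating x as zero outside [0, L-1]."""
    x = np.asarray(x, dtype=float)
    L = len(x)
    r = np.zeros(L)
    for lag in range(L):
        # x[n] * x[n - lag] is non-zero only for n = lag .. L-1,
        # so pair x[lag:] with x[:L - lag] and take their dot product.
        r[lag] = np.dot(x[lag:], x[:L - lag])
    return r

# Hypothetical toy signal, small enough to check by hand.
toy = np.array([1., 2., 3., 4.])
r_toy = auto_correlate_sketch(toy)
print(r_toy)
```

For this toy signal the lag-0 value is the signal energy ($1 + 4 + 9 + 16 = 30$), and the values shrink as the overlap between the two copies gets shorter.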

In [None]:
def auto_correlate(x):
    pass


In [None]:
connect_to_queue()

## Exercise 3
Create the same auto-correlation vector, but use the numpy function [`np.correlate`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.correlate.html) to achieve the same result without using a `for` loop. Hint: use the optional argument `mode='full'`.

Plot this to show that both methods produce the same result.
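
The sketch below shows what `mode='full'` returns for a four-sample toy signal (the array `toy` is hypothetical). The output has length $2L-1$ and covers lags $-(L-1)$ to $L-1$, so the non-negative lags are the second half of the array, starting at index $L-1$.

```python
import numpy as np

# Hypothetical toy signal, small enough to check by hand.
toy = np.array([1., 2., 3., 4.])
L = len(toy)

# mode='full' returns every overlap: 2L-1 values, lags -(L-1) .. (L-1).
full = np.correlate(toy, toy, mode='full')

# Lag 0 sits at index L-1; keep lags 0 .. L-1.
r_nonneg = full[L - 1:]

print(full)
print(r_nonneg)
```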

In [None]:
connect_to_queue()

## Exercise 4

Write the function `lag_to_bpm(l, ff)` which takes a lag value `l` and feature frequency `ff`, and returns the BPM (beats per minute) associated with that lag value.

Given the parameters above:
- What lag values ($l$) correspond to the first 4 peaks of the AC function (not including $l=0$)? Hint: you can use `fmp.find_peaks` with a threshold of $0.2$.
- What tempo (BPM) values do these peaks correspond to?
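
The unit conversion can be reasoned through as follows: a lag of $l$ novelty frames lasts $l / f\!f$ seconds, and a beat period of $T$ seconds corresponds to $60 / T$ BPM. A minimal sketch, assuming the feature frequency is $f\!f = f_s / H$ as in this lab (the name `lag_to_bpm_sketch` is hypothetical):

```python
def lag_to_bpm_sketch(lag, ff):
    """Convert a lag in novelty frames to beats per minute."""
    period_sec = lag / ff      # seconds per beat
    return 60.0 / period_sec   # beats per minute

# Assumed parameters from this lab: fs = 22050, H = 512.
ff_example = 22050 / 512

# A lag of exactly one second's worth of frames is 60 BPM.
one_second = lag_to_bpm_sketch(ff_example, ff_example)

# Halving the lag doubles the tempo.
half_second = lag_to_bpm_sketch(ff_example / 2, ff_example)
print(one_second, half_second)
```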

In [None]:
def lag_to_bpm(l, ff):
    pass


Here is some code for a metronome: it plays 10 clicks at the given BPM. Use this function to listen to the 4 candidate tempos of this song. Which of these tempo estimates are reasonable for this song?

In [None]:
def metronome(bpm):
    print('tempo =', bpm)
    fs = 22050.
    # Click positions in samples: one click every 60 / bpm seconds.
    beats = np.arange(10) * int(60. * fs / bpm)
    click_snd = fmp.load_wav("audio/click.wav")
    clicks = fmp.make_clicks(beats, click_snd = click_snd)
    ipd.display(ipd.Audio(clicks, rate=fs))

In [None]:
connect_to_queue()

## Exercise 5

We will now estimate the tempo values for the same window of audio using the Fourier method.

For the same windowed signal $x[n]$ (of length $L=N=256$), compute its Fourier transform $X[k]$ (you can use the function `np.fft.rfft`). Remember to apply a Hann window to $x[n]$ first.

Plot $\lvert X[k] \rvert$

Write the function `k_to_bpm(k, ...)` which returns the BPM corresponding to a given `k`. You will need to supply additional arguments to this function.

Given the Fourier Transform above:
- Find the frequency bins ($k$) corresponding to the first 4 prominent peaks (not including 0). You can use a threshold value of $0.25$.
- What tempo (BPM) values correspond to these values of $k$?
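
One way to think about the conversion: bin $k$ of an $N$-point DFT of a signal sampled at $f\!f$ Hz sits at $k \cdot f\!f / N$ Hz, and multiplying by 60 gives BPM. The sketch below assumes the extra arguments are the feature frequency and the FFT length; the name `k_to_bpm_sketch` and its argument names are hypothetical.

```python
def k_to_bpm_sketch(k, ff, n_fft):
    """Convert DFT bin index k to BPM for a signal sampled at ff Hz."""
    freq_hz = k * ff / n_fft   # bin centre frequency in Hz
    return 60.0 * freq_hz      # cycles per minute

# Assumed parameters from this lab: ff = fs / H = 22050 / 512, N = 256.
ff_example = 22050 / 512
bpm_per_bin = k_to_bpm_sketch(1, ff_example, 256)
print(bpm_per_bin)
```

With these parameters adjacent bins are roughly 10 BPM apart, which limits how precisely a peak can be located on the tempo axis.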

In [None]:
def k_to_bpm(k): # TODO, add additional args to this function as needed
    pass


Now listen to these 4 tempos using the `metronome` function

In [None]:
connect_to_queue()

## Exercise 6

What observations can you make about the tempo values predicted by each method and how they compare to each other?

In [None]:
connect_to_queue()

## Exercise 7

Let's make the Fourier tempo prediction more accurate by zero-padding.

- Apply a Hann window to $x[n]$
- Zero-pad the result to be 8 times as long
- Take the FFT
- Plot the magnitude
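
The steps above can be sketched on a synthetic signal with a known tempo. This is an illustration under stated assumptions, not the lab's data: `toy` is a hypothetical 2 Hz (120 BPM) cosine sampled at $f\!f = 22050 / 512$, and zero-padding is done through the `n` argument of `np.fft.rfft`, which pads the input with zeros when `n` exceeds its length.

```python
import numpy as np

# Assumed parameters from this lab.
fs, hop_size = 22050, 512
ff = fs / hop_size
L = 256

# Hypothetical test signal: a 2 Hz cosine, i.e. a 120 BPM pulse.
n = np.arange(L)
toy = np.cos(2 * np.pi * 2.0 * n / ff)

# Apply the Hann window before transforming.
windowed = toy * np.hanning(L)

# Plain FFT versus FFT zero-padded to 8 times the length.
X_plain = np.abs(np.fft.rfft(windowed))
X_pad = np.abs(np.fft.rfft(windowed, n=8 * L))

def peak_bpm(mag, n_fft):
    """BPM of the largest-magnitude bin for an n_fft-point transform."""
    k = int(np.argmax(mag))
    return 60.0 * k * ff / n_fft

bpm_plain = peak_bpm(X_plain, L)
bpm_pad = peak_bpm(X_pad, 8 * L)
print(bpm_plain, bpm_pad)
```

Zero-padding adds no new information; it samples the same windowed spectrum on a finer frequency grid, so the peak location can land closer to the true tempo.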

As in Exercise 5, find the tempos predicted by the first four peaks. Then compare these to the original Fourier tempo estimates.

In [None]:
connect_to_queue('checkoff')

## Exercise 8

If you have time left, load the beginning of Herbie Hancock's _Chameleon_ below and listen to the first 30 seconds or so.

$\Delta^s[n]$ is computed as above with $N=1024$, $H=512$, $\gamma = 100$ 

Find the candidate tempos of this song using the autocorrelation method and an `fmp.find_peaks` threshold of $0.3$ on two different portions of the song:

- a window of length 11 seconds, starting at $t = 0$ seconds.
- a window of length 15 seconds, starting at $t = 15$ seconds.
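
Since the novelty function is indexed in frames rather than seconds, each window has to be converted with the feature frequency $f\!f = f_s / H$. A minimal sketch of that conversion follows; the helper `window_to_frames` and the placeholder array `nov_placeholder` are hypothetical, and rounding to the nearest frame is an assumption.

```python
import numpy as np

def window_to_frames(start_sec, length_sec, ff):
    """Return (start, stop) frame indices for a window given in seconds."""
    start = int(round(start_sec * ff))
    stop = start + int(round(length_sec * ff))
    return start, stop

# Assumed parameters from this lab: fs = 22050, H = 512.
ff_example = 22050 / 512

first = window_to_frames(0, 11, ff_example)
second = window_to_frames(15, 15, ff_example)

# Placeholder standing in for a 30-second novelty curve.
nov_placeholder = np.zeros(int(30 * ff_example))
# Slicing past the end of a NumPy array is safe; it simply stops at the end.
x_first = nov_placeholder[first[0]:first[1]]
x_second = nov_placeholder[second[0]:second[1]]
print(first, second)
```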


Look at the first 3 candidate tempos of each portion. Can you explain what each peak "means"? In particular, the first section has an unusual 2nd tempo peak. Why is that?

It will help if you listen to these candidate tempos using the `metronome()` function.

In [None]:
snd2 = fmp.load_wav('audio/hancock_chameleon.wav', 0, 30)
ipd.display(ipd.Audio(snd2, rate=fs))  # explicit display: not the last expression in the cell
win_len = 1024
hop_size = 512
gamma = 100
ff = fs / hop_size
nov2 = fmp.spectral_novelty(snd2, win_len, hop_size, gamma)