# 21M.387 Fundamentals of Music Processing
## Lab2

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd
from ipywidgets import interact
import sys
sys.path.append("..")
import fmplib as fmp
from fmplib.pyqueue import connect_to_queue

plt.rcParams['figure.figsize'] = (12, 4)
fmp.documentation_button()

## Exercise 1

Boolean arrays can pick subsets from vectors. Run and observe the code below to understand:
- Random arrays
- Boolean arrays
- Array indexing with boolean arrays
- Counting items

In [None]:
a = np.random.random(5)
b = np.arange(9)
c = np.array([True, False, True, False, False, True, True, False, False])

print('random array =', a)
print('b > 4 = ', b > 4)
print('b[c] = ', b[c])
print(np.count_nonzero(c))

Now do this:
- Create a vector `x` of length 30 with random numbers in the range $[-1.0, 1.0)$. Note that `np.random.random` returns numbers in the range $[0.0, 1.0)$.

- Make a stem plot of `x` using `plt.stem()` - [See Docs](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.stem).
- Create 3 vector subsets of `x`:
  - `y1` = all numbers in `x` greater than 0.5.
  - `y2` = all numbers in `x` less than -0.5.
  - `y3` = all numbers in `x` between -0.5 and 0.5. For this, you will need `np.logical_and()`.
- Plot `y1`, `y2`, and `y3` in three separate plots, giving each a differently colored circle marker via the optional `markerfmt` argument. [The plt.plot docs](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot) show the marker type and marker color options.
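
As a small illustration of combining two conditions into one boolean mask, the sketch below uses a made-up array `v` (not part of the lab). It shows `np.logical_and()` next to the equivalent `&` operator form; both are standard NumPy.

```python
import numpy as np

# Made-up example values, chosen so the result is easy to check by eye
v = np.array([-0.9, -0.2, 0.0, 0.3, 0.8])

# Two equivalent ways to build the mask "strictly between -0.5 and 0.5"
mask_func = np.logical_and(v > -0.5, v < 0.5)
mask_op = (v > -0.5) & (v < 0.5)  # each comparison needs its own parentheses

# Boolean indexing keeps only the entries where the mask is True
middle = v[mask_func]
print(mask_func)
print(middle)
```

The parentheses in the `&` form matter because `&` binds more tightly than the comparison operators.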

In [None]:
# solution

x = np.random.random(30) * 2 - 1
plt.stem(x)

y1 = x[x > 0.5]
y2 = x[x < -0.5]
y3 = x[np.logical_and(x > -0.5, x < 0.5)]

plt.figure()
plt.subplot(1,3,1)
plt.stem(y1)
plt.ylim(-1, 1)
plt.subplot(1,3,2)
plt.stem(y2, markerfmt='ro')
plt.ylim(-1, 1)
plt.subplot(1,3,3)
plt.stem(y3, markerfmt='go')
plt.ylim(-1, 1);

In [None]:
connect_to_queue()

## Exercise 2

Load ~10 seconds from an audio file on your computer into array `x`:
- Use Audacity to load any audio file
- Trim to 10 seconds
- Change the Sample Rate to 22050
- Convert from Stereo to Mono
- Export as WAV

Recall that the total energy in a signal $x[n]$ of length $N$ is:
$$ E = \sum_{n=0}^{N-1} x[n]^2 $$

- Calculate the total energy for `x`.
- Calculate the average energy for `x`.
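
To make the formula concrete, here is a tiny worked example on a made-up four-sample signal `sig` (an assumption for illustration, not the lab audio). It computes the total energy as a sum of squares, checks that a dot product gives the same value, and divides by the length for the average.

```python
import numpy as np

# Made-up four-sample signal
sig = np.array([0.5, -0.5, 1.0, 0.0])

# Total energy: sum of squared samples (0.25 + 0.25 + 1.0 + 0.0)
total_energy = np.sum(sig ** 2)

# The dot product of a vector with itself is the same sum of squares
total_energy_dot = np.dot(sig, sig)

# Average energy: total energy divided by the number of samples
average_energy = total_energy / len(sig)

print(total_energy, total_energy_dot, average_energy)
```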

In [None]:
# x = fmp.load_wav("...")


In [None]:
# solution 

x = fmp.load_wav("audio/vegtown.wav")

e = np.dot(x, x)
ave_e = e / len(x)
print(f'total: {e:.3f}\nave: {ave_e:.3f}')

In [None]:
connect_to_queue()

## Exercise 3

Recall that the normalized energy signal is:
$$ \bar E^x[m] = \frac{1}{N} \sum_{n=mH}^{mH+N-1} x[n]^2 \text{ for } m \in [0, M-1] $$

Write the function `calc_energy_feature()` with parameters:  
- `x`: $x$, the signal of length $L$
- `win_len`: window length $N$
- `hop_size`: hop size $H$


The strategy for this algorithm is:
- calculate $M$
- create an empty energy array of length $M$
- in a `for` loop, calculate each value for $E[m]$ and assign it to the proper location in the energy array.
- return the array.

Plot the energy result. Try different values for $N$ and $H$ and find two that seem to work well.
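
Before writing the loop, it may help to see how $N$ and $H$ determine where each window starts. The sketch below is one possible convention, assuming only windows that fit entirely inside the signal are kept; other conventions keep partial windows at the end or zero-pad. The helper `frame_starts` and the toy signal are hypothetical names introduced only for this illustration.

```python
import numpy as np

def frame_starts(sig_len, win_len, hop_size):
    """Start indices of all windows that fit entirely inside the signal.

    Hypothetical helper: assumes the "full windows only" convention.
    """
    if sig_len < win_len:
        return np.array([], dtype=int)
    num_frames = 1 + (sig_len - win_len) // hop_size
    return np.arange(num_frames) * hop_size

# Toy signal 0, 1, ..., 9 with N = 4 and H = 2
sig = np.arange(10, dtype=float)
starts = frame_starts(len(sig), win_len=4, hop_size=2)

# Normalized energy of each window: mean of the squared samples
energy = np.array([np.mean(sig[s:s + 4] ** 2) for s in starts])

print(starts)
print(energy)
```

With $H < N$ the windows overlap, so each sample contributes to more than one energy value; a smaller $H$ gives a finer time grid at the cost of more computation.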

In [None]:
def calc_energy_feature(x, win_len, hop_size):
    pass


In [None]:
# solution
def calc_energy_feature(x, win_len, hop_size):
    
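    x_len = len(x)
    # number of windows: one per hop, so windows near the end may run past the signal
    M = int(np.ceil(x_len / hop_size))
    
    out = np.empty(M)
    
    for m in range(M):
        # slicing truncates at the end of x, so the last windows can be shorter than win_len
        win = x[m * hop_size : m * hop_size + win_len]
        out[m] = np.dot(win, win) / win_len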
    
    return out

win_len = 1400
hop_size = 700
e = calc_energy_feature(x, win_len, hop_size)
plt.plot(e);

In [None]:
connect_to_queue()

## Exercise 4

We now convert the energy signal into an _Energy Novelty Function_ (ENF):

$$\Delta[n] = \vert \bar E^x[n+1] - \bar E^x[n] \vert_{\ge 0}$$

Write the function `calc_enf(e)` with parameter:  

- `e`: The energy feature signal $\bar E^x$

Strategy:
- Take the discrete-time derivative of $\bar E^x[n]$. 
- Set all negative values to 0. This can be done by using boolean indexing as shown in the example below.

Then create the ENF for your signal and plot it.
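
For comparison, the same two steps can also be written without boolean indexing. The sketch below is an alternative, not the approach this exercise asks for: it uses `np.diff` for the discrete derivative and `np.maximum` to clamp negative values to zero (half-wave rectification). The `energy` values are made up.

```python
import numpy as np

# Made-up energy feature values
energy = np.array([0.1, 0.4, 0.3, 0.9, 0.2])

# Discrete derivative: diff[n] = energy[n + 1] - energy[n]
diff = np.diff(energy)

# Half-wave rectification: keep increases, replace decreases with 0
novelty = np.maximum(diff, 0.0)

print(diff)
print(novelty)
```

Note that the result is one element shorter than the input, because each output value is built from a pair of neighboring inputs.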

In [None]:
a = np.arange(8)
a[a < 5] = 99
print(a)

# calculate the energy novelty function from an energy feature signal
def calc_enf(e):
    pass

In [None]:
# solution
def calc_enf(e):
    de = e[1:] - e[:-1]
    de[de < 0] = 0
    return de

enf = calc_enf(e)
plt.plot(enf);

In [None]:
connect_to_queue()

## Exercise 5

Hopefully, your ENF plot has a number of sharp peaks. To find the locations of these peaks, we use the peak detection function below.

- Use `fmp.find_peaks(x, thresh)` to locate the peaks of your ENF.
- Plot the peaks on top of the ENF. There are a few ways to visualize peaks. I like placing a red circle at the peak location. `plt.plot(xs, ys, 'ro')` will place a red 'o' at the locations specified by `xs` and `ys` (a point at `(xs[0], ys[0])`, `(xs[1], ys[1])`, and so on).
- Run the code several times with different values for `thresh` so that you get "just the good peaks".
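
To build intuition for how a threshold changes the result, here is a minimal peak picker. It is not the implementation of `fmp.find_peaks`, whose internals and exact threshold semantics are not shown here. The helper `simple_peaks` is hypothetical and assumes a peak is a strict local maximum whose value exceeds an absolute threshold.

```python
import numpy as np

def simple_peaks(sig, thresh):
    """Indices of strict local maxima whose value exceeds thresh.

    Hypothetical helper for illustration; not fmp.find_peaks.
    """
    sig = np.asarray(sig, dtype=float)
    mid = sig[1:-1]  # every sample that has both a left and a right neighbor
    is_peak = (mid > sig[:-2]) & (mid > sig[2:]) & (mid > thresh)
    return np.nonzero(is_peak)[0] + 1  # +1 undoes the offset introduced by sig[1:-1]

# Made-up novelty curve with one small and one large peak
novelty = np.array([0.0, 0.3, 0.0, 0.0, 0.6, 0.1, 0.0])

low = simple_peaks(novelty, thresh=0.2)   # threshold below both peaks
high = simple_peaks(novelty, thresh=0.5)  # threshold between the two peaks

print(low)
print(high)
```

Raising the threshold discards the smaller peak while keeping the larger one, which is the trade-off to explore when tuning `thresh`.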


In [None]:
# solution
# call find_peaks and plot results...

peaks = fmp.find_peaks(enf, thresh=0.5)

plt.plot(enf)
plt.plot(peaks, enf[peaks], 'ro')
plt.show()

In [None]:
connect_to_queue()

## Exercise 6

Now that we have good peak locations for onsets, we will synthesize a click track corresponding to the peaks.

Create a function `sonify(locs, snd)` which will place copies of the waveform `snd` at each location specified by `locs`.
- Create an output array of zeros (`np.zeros`) of the appropriate length. 
- Loop through each sample in `locs` and add `snd` into the proper location of the output array.
- Important: the `locs` array must be given in audio samples (at the 22050 Hz sample rate), but the peaks you generated are indices into the feature signal, which has a much lower sampling rate.

You can listen to the resulting clicks with `ipd.Audio()`.  

Even better is listening to the original audio `x` and the click track at the same time. To do that, provide a list of two arrays to `ipd.Audio()` and you will hear each array from a different speaker. Both arrays must be of the same length. You can adjust the relative volume of clicks to audio by scaling either signal by some constant factor.
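
Two small details from the text above can be sketched in isolation: converting feature-rate indices to audio samples, and making two arrays the same length. The values and the helper `match_length` below are hypothetical, introduced only for this illustration. The conversion assumes each feature index maps to the start of its window.

```python
import numpy as np

hop = 700                              # hop size in samples (example value)
peak_frames = np.array([3, 10, 25])    # made-up peak indices at the feature rate
peak_samples = peak_frames * hop       # the same events, expressed in audio samples

def match_length(sig, target_len):
    """Return a copy of sig truncated or zero-padded to target_len.

    Hypothetical helper for illustration.
    """
    sig = np.asarray(sig)
    if len(sig) >= target_len:
        return sig[:target_len].copy()
    return np.pad(sig, (0, target_len - len(sig)))  # pads with zeros at the end

short = match_length(np.ones(3), 5)    # padded with two zeros
long_ = match_length(np.ones(5), 3)    # truncated to three samples

print(peak_samples)
print(short, long_)
```

Returning a new array avoids modifying the input in place, which can be convenient when the same signal is reused in later cells.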


In [None]:
fs = 22050
click_wav = fmp.load_wav('audio/click.wav')

def sonify(locs, snd):
    pass


In [None]:
# solution

# first convert from feature sampling rate to audio sampling rate:
click_locs = peaks * hop_size

def sonify(locs, snd):
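    snd_len = len(snd)
    # the output must hold a full copy of snd at the largest location;
    # an empty locs array yields an empty output instead of an IndexError
    o_len = (max(locs) + snd_len) if len(locs) else 0
    output = np.zeros(o_len)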
    
    for l in locs:
        output[l:l+snd_len] += snd
    
    return output

y = sonify(click_locs, click_wav)

y.resize(len(x))
ipd.Audio([x * 0.25, y], rate=22050)

You may now do a checkoff, but continue on to Exercise 7 if you have time.

In [None]:
connect_to_queue('checkoff')

## Exercise 7

If you listen very carefully, you may notice that the click sounds are a tiny bit off from the music. If you can't hear this, increase `win_len` to make the effect more pronounced.

Why are the click sounds a bit off? How can this be fixed? 

Try making some adjustments in the code to fix this issue.

In [None]:
# solution
win_len = 1400
hop_size = 700
e = calc_energy_feature(x, win_len, hop_size)
enf = calc_enf(e)
peaks = fmp.find_peaks(enf, thresh=0.5)


# there are two sources of timing error, each corrected by shifting the clicks later:
# 1) the energy feature signal did not use a centered window. This results in a "forward bias" of win_len/2
# 2) the discrete derivative compares n+1 with n, which adds a further bias of about hop_size/2

y = sonify(peaks * hop_size + win_len // 2 + hop_size // 2, click_wav)
y.resize(len(x))
ipd.Audio([x * 0.25, y], rate=22050)
